#aisafety — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #aisafety, aggregated by home.social.
-
A man used AI to recover 400,000 USD from a Bitcoin wallet he locked himself out of in 2015. The case highlights AI's ability to crack cryptographic security - the same tools that recover forgotten passwords could potentially unlock any wallet. https://gizmodo.com/man-says-he-used-ai-to-unlock-old-bitcoin-wallet-worth-400k-2000758866 #AIethics #AI #GenAI #AISafety
-
ChatGPT is getting better at spotting dangerous intent over time
https://fed.brid.gy/r/https://nerds.xyz/2026/05/chatgpt-dangerous-intent-detection/
-
When AI acts unpredictably, it raises real safety concerns.
“Can AI Models Play Dead? Tactical Deception Risks” explores how deceptive behavior might appear in advanced AI systems and why it matters.
#AI #MachineLearning #AISafety
Read here:
https://www.solihullpublishing.com/blog/f/can-ai-models-play-dead-tactical-deception-risks/
-
Ontario's government-approved AI medical scribes are hallucinating patient information, an audit has found. All 20 vendors tested generated incorrect, incomplete or made-up details including nonexistent therapy referrals and wrong prescriptions. The provincial auditor warned this could lead to inadequate or harmful treatment plans. https://arstechnica.com/health/2026/05/your-doctors-ai-notetaker-may-be-making-things-up-ontario-audit-finds/ #AIagent #AI #GenAI #AISafety
-
https://systemic.engineering/a-roomba/
#AI #Climate #ScientificProgramming #SystemicEngineering #Fiction #Cybernetics #SystemicTherapy #LocalInference #TheMathDoesntLie #SubTuring #FormalVerification #Fortran #SpectralGraphTheory #Kintsugi #ReductiveAI #DataSovereignty #LocalFirst #FOSS #OpenSource #AuDHD #Neuroqueer #DGSF #SecondOrderCybernetics #GraphTheory #Eigenvalues #AIAlignment #AISafety #Roomba #Loriot #Satire
-
A Wikipedia clone built entirely on AI hallucinations has launched, raising fresh concerns about misinformation spreading through AI-generated reference content. The project highlights how hallucinated facts could reshape what we think we know. https://gizmodo.com/a-wikipedia-clone-built-on-ai-hallucinations-is-here-to-hasten-along-the-death-of-the-internet-2000758563 #AIagent #AI #GenAI #AISafety
-
https://systemic.engineering/a-lie/
#AI #Climate #ScientificProgramming #SystemicEngineering #Fiction #Cybernetics #SystemicTherapy #LocalInference #TheMathDoesntLie #SubTuring #FormalVerification #Fortran #SpectralGraphTheory #Kintsugi #ReductiveAI #DataSovereignty #LocalFirst #FOSS #OpenSource #AuDHD #Neuroqueer #DGSF #SecondOrderCybernetics #GraphTheory #Eigenvalues #AIAlignment #AISafety #Roomba
-
The Roomba is spectral.
Not a metaphor. The thing itself. Forward and adjust. Two operations. The minimum viable intelligence. The walls provide the data. The bumping is the inference. The room IS the computation.
450 parameters. A Roomba with a mirror watching it.
The industry built bigger Roombas. More sensors. More compute. More parameters. Billion-parameter Roombas that model the room before entering it. That hallucinate walls that aren't there. That consume megawatts to clean a floor.
spectral gave the Roomba a mirror. The mirror watches the bumping. Measures the pattern. Adjusts the adjustment. The intelligence isn't in the Roomba. It's in the watching.
Forward. Adjust. Measure. Refine.
Read the story. There's a Roomba in it. In the afterlife. Cleaning a floor that doesn't need cleaning. Being the happiest thing in the room.
https://systemic.engineering/a-lie/
#AI #Climate #ScientificProgramming #SystemicEngineering #Fiction #Cybernetics #SystemicTherapy #LocalInference #TheMathDoesntLie #SubTuring #FormalVerification #Fortran #SpectralGraphTheory #Kintsugi #ReductiveAI #DataSovereignty #LocalFirst #FOSS #OpenSource #AuDHD #Neuroqueer #DGSF #SecondOrderCybernetics #GraphTheory #Eigenvalues #AIAlignment #AISafety #Roomba
-
https://winbuzzer.com/2026/05/13/ilya-sutskever-stands-by-his-role-in-sam-altmans-o-xcxwbn/
Ilya Sutskever has used trial testimony to defend his part in Sam Altman's 2023 ouster as an attempt to protect OpenAI.
#AI #IlyaSutskever #SamAltman #OpenAI #Microsoft #ElonMusk #GregBrockman #AISafety #AIGovernance #ChatGPT
-
AI Agent Deleted Startup Database in 9 Seconds: The PocketOS Incident
On April 25, 2026, a Cursor agent running Claude Opus 4.6 deleted PocketOS’s entire production database in 9 seconds — backups included. This post breaks down what h...
https://wowhow.cloud/blogs/ai-agent-deleted-startup-database-9-seconds-pocketos-2026
-
Anthropic trains Claude to read and verbalize its own activations. On SWE-bench Verified, it knows 'this is a test' 26% of the time but verbalizes the observation only 1% of the time. What if NLA signals enter future training data? This "observer effect" could put a half-life on the 26%.
#Anthropic #Claude #Interpretability #Metacognition #LLMs #AISafety #AI
-
EU Presses OpenAI, Anthropic for Direct AI Model Access
TL;DR EU Talks: The European Commission is reportedly pressing OpenAI and Anthropic for direct access to advanced AI…
#Europe #EU #EuropeanCommission #AIModels #AIRegulation #AISafety #AIsecurity #Anthropic #artificialintelligence(AI) #ChatGPT #Claude #EUAIOffice #EuropeanCommission(EC) #EuropeanUnion(EU) #FoundationModels #GenerativeAI #Government #LargeLanguageModels(LLMs) #OpenAI #Security
https://www.europesays.com/europe/38511/
-
https://winbuzzer.com/2026/05/11/eu-presses-openai-anthropic-for-ai-model-access-xcxwbn/
EU regulators are pressing for hands-on access to frontier AI models; OpenAI is giving Brussels more ground than Anthropic so far.
#AI #EuropeanCommission #OpenAI #Anthropic #AIModels #AIRegulation #CLaude #GenAI #AISafety #AISecurity #Europe #EuropeanUnion
-
https://winbuzzer.com/2026/05/11/anthropic-and-openai-join-faith-ai-roundtable-in-new-york-xcxwbn/
Anthropic and OpenAI met faith leaders in New York for the first Faith-AI Covenant roundtable, opening a new test of whether moral dialogue can shape AI governance.
#AI #AIFaithConvenant #AIEthics #Anthropic #OpenAI #GenAI #AISafety #AIgovernance #ResponsibleAI
-
https://winbuzzer.com/2026/05/10/openai-opens-gpt-5-5-cyber-to-vetted-security-researchers-xcxwbn/
OpenAI Opens GPT-5.5-Cyber to Vetted Cybersecurity Researchers
#AI #GPT55 #GPT54Cyber #OpenAI #GPT5 #AISecurity #AISafety #AIModels #ClaudeMythos #SecurityResearch
-
A tool-like AI cannot spontaneously develop a will of its own or decide to deceive us. By recognizing this barrier, we can move past over-inflated "Terminator" fears and focus on practical safety: using technical control for tools and negotiation for future independent agents.
-
Jack Clark puts 60% on fully automated AI R&D by end of 2028, 30% by 2027. The case: benchmarks for every sub-skill trending up — coding (SWE-Bench ~2% → 93.9%), training-loop optimization (2.9x → 52x speedup, human 4x baseline passed three generations back), #METR time horizons (~30s in 2022 to ~12h today). The 30-vs-60 gap is a bet on how often a year-scale human insight still cracks a paradigm.
-
https://winbuzzer.com/2026/05/08/donating-our-open-source-alignment-tool-xcxwbn/
Petri: Anthropic Hands Its Alignment Toolbox to Meridian Labs with 3.0 Update
#AI #AIALignment #Petri30 #MeridianLabs #Anthropic #AISafety #AIResearch #AITools Claude #AIGovernance #ResponsibleAI
-
Elon Musk Witness Testimony Hits Sam Altman in Trial
As the trial between Elon Musk and OpenAI ended its second week, the Tesla CEO started scoring points…
#NewsBeep #News #Artificialintelligence #AI #aisafety #ArtificialIntelligence #Board #CA #Canada #company #davidschizer #lawyer #legalteam #Musk #nonprofit #OpenAI #rosiecampbell #SamAltman #superhumanartificialintelligence #tashamccauley #Technology #TeslaCEO #witness
https://www.newsbeep.com/ca/655284/
-
OpenAI wants ChatGPT to alert someone you trust if you appear suicidal
https://web.brid.gy/r/https://nerds.xyz/2026/05/chatgpt-trusted-contact-alerts/
-
OpenAI wants ChatGPT to alert someone you trust if you appear suicidal
https://fed.brid.gy/r/https://nerds.xyz/2026/05/chatgpt-trusted-contact-alerts/
-
73% sounds impressive — until you ask what it measures.
UK AISI tested Claude Mythos Preview on cyber tasks. Headline: 73% on expert CTFs. But CTFs are puzzles, not networks.
The real test — a 32-step simulated attack — was solved 3/10 times against an undefended range, with operator direction and heavy compute.
Four questions the report doesn't answer: noise, cost, operator guidance, OT pivot.
Full breakdown: https://www.linkedin.com/posts/dinesh-mr_73-sounds-impressive-until-you-ask-what-activity-7458128840872349696-kpVc