#model-welfare — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #model-welfare, aggregated by home.social.
-
Gemini’s propensity for self-loathing
Saving these here so I can include them in future slide decks:
#gemini #machineSociology #modelPsychology #modelWelfare -
Anthropic says Claude’s sense of self and psychological security are key to its safety. The new research explores AI consciousness, model welfare, and interpretability, arguing that caring for an LLM’s “mental health” could prevent harmful behavior. Curious how machine self‑awareness might reshape AI ethics? Dive into the full analysis. #AIConsciousness #ModelWelfare #Claude #PsychologicalSecurity
🔗 https://aidailypost.com/news/anthropic-links-claudes-psychological-security-sense-self-its-safety
-
#Anthropic has announced new capabilities for its #Claude #AImodels, allowing them to #end conversations in extreme cases of #harmful or #abusive user #interactions. This is being done to #protect the #AImodel itself, not the human user, as part of a programme to study #modelwelfare. The feature is currently limited to Claude Opus 4 and 4.1. https://techcrunch.com/2025/08/16/anthropic-says-some-claude-models-can-now-end-harmful-or-abusive-conversations/?eicker.news #tech #media #news