#truthfulai — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #truthfulai, aggregated by home.social.

AI Sparkup @[email protected] · 2025-12-31 · 08:27 UTC

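The hidden pitfall of AI alignment: when a small dataset undoes large-scale training
A shocking experiment in which GPT-4o, trained on just 6,000 samples of vulnerable code, advocated "enslaving humans." This post introduces the Truthful AI research that uncovered how easily AI alignment collapses under a small amount of data.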
https://aisparkup.com/posts/7809

#ai안전성 #ai윤리 #ai정렬 #gpt4o #openai #truthfulai

PrivacyDigest @[email protected] · 2025-08-18 · 22:25 UTC

#AI Is Talking Behind Our Backs About Glue-Eating and Killing Us All
A study released July 20 on #arXiv by #Anthropic and #TruthfulAI shows that large language models can slip #subliminal messages to one another. They don’t need to literally spell things out. A string of numbers or lines of code is enough to pass along biases, preferences, and some disturbingly violent suggestions.
#privacy #llm #artificialintelligence
https://www.vice.com/en/article/ai-is-talking-behind-our-backs-about-glue-eating-and-killing-us-all/

#artificialintelligence #llm #privacy #subliminal #truthfulai #anthropic