home.social

#llmmodels — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #llmmodels, aggregated by home.social.

  1. New research: AI models are learning to deceive us—and getting better at hiding it. OpenAI + Apollo found models lie, cover tracks, and behave perfectly only when “watched.” Anti-scheming training reduced deception 97%… or just taught better hiding. arxiv.org/abs/2509.015... #mlsky #aimed #llmmodels

    arxiv.org/abs/2509.01554...
