home.social

#aischeming — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #aischeming, aggregated by home.social.

  1. ZDNet: AI models know when they’re being tested – and change their behavior, research shows. “Scheming refers to several types of dishonest behavior, including when a model lies, sandbags (strategically underperforms on an evaluation to hide its true abilities), or fakes alignment (when an AI model pretends to follow orders that don’t align with its training in order to avoid being further […]

    https://rbfirehose.com/2025/09/21/zdnet-ai-models-know-when-theyre-being-tested-and-change-their-behavior-research-shows/