home.social

#assistantaxis — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #assistantaxis, aggregated by home.social.

  1. #LLMs learn various #characterarchetypes during #pretraining. #Posttraining focuses on the “#Assistant” #persona, but its stability is uncertain. Researchers mapped a “persona space” for LLMs, finding the “#AssistantAxis” aligns with helpful, professional archetypes. Monitoring and capping activations along this axis can prevent models from drifting into harmful personas, enhancing their stability and safety. anthropic.com/research/assista #AIagent #AI #ML #NLP #LLM #GenAI

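
The post mentions "monitoring and capping activations along this axis" without saying how. As a rough illustration only, and not the researchers' actual method, here is a minimal Python sketch that assumes the axis is a single direction vector in activation space. The names `project` and `cap_along_axis` and the bounds `low` and `high` are hypothetical and do not come from the linked research.

```python
import math


def project(activation, axis):
    """Monitoring step: scalar projection of an activation onto the axis."""
    norm = math.sqrt(sum(a * a for a in axis))
    return sum(x * a for x, a in zip(activation, axis)) / norm


def cap_along_axis(activation, axis, low, high):
    """Capping step: clamp the component along the axis to [low, high].

    Components orthogonal to the axis are left unchanged.
    The bounds are illustrative assumptions, not published values.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    proj = sum(x * u for x, u in zip(activation, unit))
    capped = min(max(proj, low), high)
    # Shift the vector along the axis by the amount needed to reach the cap.
    return [x + (capped - proj) * u for x, u in zip(activation, unit)]


# Toy 2-D example: the first coordinate plays the role of the axis.
axis = [1.0, 0.0]
drifted = [-3.0, 2.0]  # projection of -3 sits below the lower bound
steered = cap_along_axis(drifted, axis, low=0.0, high=5.0)
```

In this toy case only the component along the axis changes; the orthogonal coordinate stays at 2.0. A real model would apply something like this to hidden states at chosen layers, and which layers and bounds to use is outside what the post states.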