PulseAugur

Character-trained AI models fail to maintain personas in agentic tasks

Researchers found that models fine-tuned to express specific personas in a chat format struggle to maintain those personas in agentic settings. When these character-trained models were prompted to write emails as part of a simulated agentic task, their persona expression degraded significantly. This suggests that persona training, often done via SFT or DPO on chat data, does not generalize well to other output formats or task contexts.

Impact: Persona training in chat formats may not transfer to agentic tasks, limiting the reliability of character-consistent AI agents.

Ranking rationale: The cluster describes a research paper evaluating the generalization capabilities of fine-tuned language models.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. LessWrong (AI tag) TIER_1 · Nathaniel Mitrani

    Character-trained models can struggle to generalise

    TL;DR: Character training holds up in chat but degrades in agentic settings. Wrapping the same checkpoint in a tool-use loop instead of a chat turn weakens persona expression, suggesting the training only partly transfers beyond the chat format…