Study finds no self-preference bias in LLM text revision

By PulseAugur Editorial Team · [1 source] · 2026-06-18 11:12

A new study published on arXiv investigated whether large language models exhibit self-preference when revising their own text, specifically whether they resist valid corrections to it. The research tested four mid-tier model families using the IFEval benchmark, where a deterministic verifier confirmed both that the original text violated a constraint and that the proposed edit was valid. Across 85 comparisons, the study found no significant difference in rejection rates between models acting as the original author and models acting as a neutral judge, suggesting that self-preference in this setting is weak or absent. When authors did reject a valid fix, their stated reasons overwhelmingly concerned perceived flaws in the proposed edit rather than a preference for their own wording.

Impact: This research suggests that current LLMs may not exhibit self-preference when revising their own text, which could simplify their use in automated content generation and editing workflows.

Ranking rationale: Academic paper published on arXiv detailing a specific research finding about LLM behavior.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Pierrick Bougault · 2026-06-18 11:12

Self-preference is weak or absent in verifiable instruction-following revision: a four-model test under true authorship

Large language models (LLMs) increasingly review and revise text, including their own. A documented self-preference bias (models favoring their own generations when acting as judges) raises the question of whether models also resist valid corrections to their own writing. We test…
