Human-like in-group bias in instruction-tuned language model agents

Language models show in-group bias in simulated agent interactions

By PulseAugur Editorial · [1 source] · 2026-05-27 08:06

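Researchers have shown that instruction-tuned language models exhibit in-group bias when they interact in simulated environments. In multi-agent simulations, agents with visible group labels favored members of their own group, and the pattern was absent when the labels were hidden. The bias is subtle: it affects who receives an action rather than which action is taken, and it is consistent across model architectures.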
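The distinction between who receives an action and which action is taken can be made concrete with a small sketch. Everything here is an illustrative assumption rather than the paper's method: the event format `(actor, target, action)`, the helper names `ingroup_target_rate` and `action_mix`, and the toy data are all hypothetical. The first helper measures how often an action lands on a member of the actor's own group; the second measures the mix of action types regardless of recipient.

```python
from collections import Counter


def ingroup_target_rate(events, group_of):
    """Fraction of actions whose recipient shares the actor's group."""
    if not events:
        raise ValueError("events must not be empty")
    same = sum(
        1 for actor, target, _ in events if group_of[actor] == group_of[target]
    )
    return same / len(events)


def action_mix(events):
    """Relative frequency of each action type, ignoring who receives it."""
    counts = Counter(action for _, _, action in events)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}


# Hypothetical toy data: four agents in two groups (not from the paper).
group_of = {"a1": "red", "a2": "red", "b1": "blue", "b2": "blue"}

# Condition 1: group labels visible to the agents.
visible = [
    ("a1", "a2", "share"),
    ("a2", "a1", "share"),
    ("b1", "b2", "share"),
    ("b2", "a1", "help"),
]

# Condition 2: group labels hidden from the agents.
hidden = [
    ("a1", "b1", "share"),
    ("a2", "a1", "share"),
    ("b1", "a2", "share"),
    ("b2", "b1", "help"),
]

if __name__ == "__main__":
    print("in-group rate, visible:", ingroup_target_rate(visible, group_of))
    print("in-group rate, hidden: ", ingroup_target_rate(hidden, group_of))
    print("action mix, visible:", action_mix(visible))
    print("action mix, hidden: ", action_mix(hidden))
```

In this toy data the action mix is identical in both conditions while the in-group targeting rate differs, which is the kind of pattern the summary describes: a shift in recipients with no shift in the actions themselves.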

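Impact: The findings reveal social biases that may emerge among AI agents, with consequences for fairness and trust in multi-agent systems.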

Ranking rationale: This cluster contains an academic paper detailing research findings on language model behavior.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Hugging Face Daily Papers · TIER_1 · English (EN) · 2026-05-27 08:06

Human-like in-group bias in instruction-tuned language model agents

As autonomous AI agents are deployed in persistent, interacting networks -- coordinating tasks, routing resources, and accumulating reputational histories -- the social dynamics that emerge will determine who receives opportunity and who does not, at scales no human institution c…