From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

Large language models struggle to simulate different demographic perspectives on hate speech

By PulseAugur Editorial Team · [2 sources] · 2026-06-04 15:09

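A new research paper examines how effectively persona-conditioned large language models can simulate different demographic perspectives for hate speech annotation. The study finds that current models do not consistently capture differences between human groups, in-group sensitivity, or indirect predictions of how other groups would react. However, indirect prompting with Llama 3.1 showed the most promise in approximating human disagreement patterns.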

Impact: For nuanced tasks such as hate speech detection, large language models may not reliably replace diverse human annotators.

Ranking rationale: This cluster contains an academic paper detailing research findings on large language model capabilities.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Paloma Piot, Javier Parapar · 2026-06-05 04:00

From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

arXiv:2606.06266v1 Announce Type: new Abstract: Hate speech detection is inherently subjective: people from different demographic groups perceive the same content very differently. Collecting enough annotations from multiple demographic groups is costly and difficult to scale. Pe…

arXiv cs.CL TIER_1 English(EN) · Javier Parapar · 2026-06-04 15:09

From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

Hate speech detection is inherently subjective: people from different demographic groups perceive the same content very differently. Collecting enough annotations from multiple demographic groups is costly and difficult to scale. Persona-conditioned Large Language Models (models …