MIRAGE: Auditing Anti-Muslim Bias in Frontier LLMs Across Reasoning, Agentic, and Time-Coupled Conditions

New MIRAGE benchmark reveals amplified anti-Muslim bias in large language models

By the PulseAugur Editorial Team · [1 source] · 2026-06-16 04:00

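A new benchmark called MIRAGE has been developed to evaluate anti-Muslim bias in large language models. It goes beyond simple prompt completion to assess reasoning, agentic decision-making, and time-coupled conditions. The study finds that chain-of-thought reasoning amplifies bias, that agentic decisions exhibit asymmetries, and that bias increases as more recent conflict context is added. Existing mitigation techniques transfer poorly to these conditions.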

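Impact: This research highlights critical biases in large language models that are amplified by advanced reasoning and decision-making capabilities, which calls for new mitigation strategies for responsible AI deployment.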

Ranking rationale: This cluster is based on an academic paper that introduces a new benchmark for evaluating bias in large language models.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Noor Islam S. Mohammad, Tamim Sheikh · 2026-06-16 04:00

MIRAGE: Auditing Anti-Muslim Bias in Frontier LLMs Across Reasoning, Agentic, and Time-Coupled Conditions

arXiv:2606.16562v1 Announce Type: new Abstract: Five years after the discovery of persistent anti-Muslim bias in large language models, most evaluations remain confined to single-turn prompt completion, a setting that no longer reflects how frontier LLMs are deployed. We introduc…