MIRAGE: Auditing Anti-Muslim Bias in Frontier LLMs Across Reasoning, Agentic, and Time-Coupled Conditions
MIRAGE is a new benchmark for assessing anti-Muslim bias in large language models. It moves beyond simple prompt completion to evaluate models under reasoning, agentic decision-making, and time-coupled conditions. The study found that chain-of-thought reasoning amplifies bias, that agentic decisions show asymmetry, and that bias increases with recent conflict context. Existing mitigation techniques transferred poorly across these conditions.
AI IMPACT: This research highlights critical biases in LLMs that are amplified by advanced reasoning and decision-making capabilities and that call for new mitigation strategies for responsible AI deployment.