Leading AI models such as GPT and Gemini frequently provide correct answers while citing non-existent or irrelevant evidence. This phenomenon, termed "attribution hallucination" by researchers at Peking University, poses a significant risk in critical sectors like law and medicine. To address this, a new benchmark called CiteVQA has been developed to systematically evaluate and identify these citation errors.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT The new CiteVQA benchmark highlights attribution hallucination in AI models, which poses risks for regulated industries, and prompts the development of more reliable citation methods.
RANK_REASON The cluster describes a new academic benchmark for evaluating AI model behavior.