Leading AI models such as GPT and Gemini frequently provide correct answers while citing non-existent or irrelevant evidence. This phenomenon, termed "attribution hallucination" by researchers at Peking University, poses a significant risk in critical sectors like law and medicine. To address this, a new benchmark called CiteVQA has been developed to systematically evaluate and identify these citation errors.
IMPACT: The new CiteVQA benchmark highlights attribution hallucination in AI models, which poses risks for regulated industries, and prompts the development of more reliable citation methods.
AI-generated summary · Google Gemini · from 1 source.