CiteVQA
PulseAugur coverage of CiteVQA — every cluster mentioning CiteVQA across labs, papers, and developer communities, ranked by signal.
- 2026-05-13 research_milestone Introduction of the CiteVQA benchmark for evaluating evidence attribution in multimodal large language models. Source
2 days with sentiment data
- GPT-4 and other AI models fail to cite sources accurately, study finds
A new study based on the CiteVQA benchmark indicates that leading AI models, including GPT-4, frequently provide correct answers but struggle to reliably cite their sources. This inability to attribute information accurately raises conce…
- AI models hallucinate citations, new benchmark reveals
Leading AI models such as GPT and Gemini frequently provide correct answers while citing non-existent or irrelevant evidence. This phenomenon, termed "attribution hallucination" by researchers at Peking University, pose…
- New benchmark CiteVQA exposes "Attribution Hallucination" in LLMs
Researchers have introduced CiteVQA, a new benchmark designed to evaluate multimodal large language models (MLLMs) on their ability to accurately attribute answers to specific source regions within documents. Unlike pre…