Disentangling Visual and Factual Correctness in LVLMs' Visualization Literacy

New framework tests LVLMs' visual reasoning against factual recall

By the PulseAugur editorial team · [1 source] · 2026-06-03 04:00

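Researchers have developed a new framework that separates visual interpretation from factual recall in large vision-language models (LVLMs). Existing evaluation methods often conflate the two abilities, which makes genuine visual reasoning hard to assess. Experiments on 15 state-of-the-art LVLMs, using a counterfactual visualization literacy assessment, show that when the two conflict, many models rely on factual priors rather than visual evidence. Human test participants did not behave this way.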
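The idea of scoring visual and factual correctness separately can be pictured with a minimal sketch. This is not the authors' implementation. It assumes each test item pairs a counterfactual chart with two reference answers, one that matches what the chart shows and one that matches the real-world fact. The names `Item`, `classify`, and `summarize` are hypothetical, and exact string matching stands in for whatever answer grading the paper actually uses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Item:
    """One hypothetical test item built on a counterfactual chart."""
    question: str
    visual_answer: str   # what the chart actually shows
    factual_answer: str  # what is true in the real world
    model_answer: str    # what the model replied


def classify(item: Item) -> str:
    """Label a reply as following the chart, the prior, or neither.

    Assumes visual_answer and factual_answer differ, as they would on a
    counterfactual item; if they are equal, the reply counts as "visual".
    """
    answer = item.model_answer.strip().lower()
    if answer == item.visual_answer.strip().lower():
        return "visual"
    if answer == item.factual_answer.strip().lower():
        return "factual"
    return "neither"


def summarize(items: list[Item]) -> dict[str, float]:
    """Return the share of replies in each category (zeros if no items)."""
    labels = ("visual", "factual", "neither")
    if not items:
        return {label: 0.0 for label in labels}
    counts = Counter(classify(item) for item in items)
    return {label: counts.get(label, 0) / len(items) for label in labels}
```

Under these assumptions, a high "factual" share on counterfactual items would indicate a model answering from memory rather than from the chart, which is the behavior the summary describes.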

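Impact: This research highlights a key gap in how LVLMs are evaluated. It suggests that current benchmarks may overestimate their visual reasoning ability and underscores the need for more robust evaluation methods.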

Ranking rationale: Academic paper introducing a new framework and benchmark for evaluating LVLMs.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · Soohyun Lee, Jaeyoung Kim, Seokhyeon Park, Sihyeon Lee, Jiwon Song, Bohyoung Kim, Hyunjoo Song, Jinwook Seo · 2026-06-03 04:00

Disentangling Visual and Factual Correctness in LVLMs' Visualization Literacy

arXiv:2606.03142v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) show strong visualization interpretation, yet it is unclear whether their responses reflect genuine reasoning over visual evidence or factual priors learned during training. Current evaluations m…

