
New benchmarks DRAGON and OmniSch test LMMs' diagram reasoning

By the PulseAugur editorial team · [4 sources] · 2026-04-28 04:00

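Researchers have introduced DRAGON, a new benchmark designed to measure how well vision-language models (VLMs) ground their reasoning in specific visual evidence within diagrams. It addresses a known limitation: a model can reach the correct answer through spurious correlations rather than a genuine understanding of the visual information. DRAGON contains more than 11,000 annotated question instances drawn from six existing diagram question-answering datasets, and its test set includes human-verified annotations of the reasoning evidence. Eight VLMs were evaluated on their ability to localize this evidence across a range of diagram types, with the aim of making diagram-based reasoning more interpretable and reliable.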

Impact: Improves the evaluation of visual reasoning over diagrams and pushes toward more interpretable and reliable AI systems.

Ranking rationale: This is a research paper introducing a new benchmark for evaluating AI models.


AI-generated summary · Google Gemini · based on 4 sources.

Sources [4]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-04-28 05:24

DRAGON: A Benchmark for Evidence-Grounded Visual Reasoning over Diagrams

Diagram question answering (DQA) requires models to interpret structured visual representations such as charts, maps, infographics, circuit schematics, and scientific diagrams. Recent vision-language models (VLMs) often achieve high answer accuracy on these tasks, yet correct ans…
arXiv cs.CV TIER_1 English(EN) · Anirudh Iyengar Kaniyar Narayana Iyengar, Tampu Ravi Kumar, Gaurav Najpande, Manan Suri, Dinesh Manocha, Puneet Mathur, Vivek Gupta · 2026-04-29 04:00

DRAGON: A Benchmark for Evidence-Grounded Visual Reasoning over Diagrams

arXiv:2604.25231v1 Announce Type: new Abstract: Diagram question answering (DQA) requires models to interpret structured visual representations such as charts, maps, infographics, circuit schematics, and scientific diagrams. Recent vision-language models (VLMs) often achieve high…
arXiv cs.CV TIER_1 English(EN) · Vivek Gupta · 2026-04-28 05:24

DRAGON: A Benchmark for Evidence-Grounded Visual Reasoning over Diagrams

Diagram question answering (DQA) requires models to interpret structured visual representations such as charts, maps, infographics, circuit schematics, and scientific diagrams. Recent vision-language models (VLMs) often achieve high answer accuracy on these tasks, yet correct ans…
arXiv cs.CV TIER_1 English(EN) · Taiting Lu, Kaiyuan Lin, Yuxin Tian, Mingjia Wang, Yubo Wang, Muchuan Wang, Sharique Khatri, Akshit Kartik, Yixi Wang, Amey Santosh Rane, Yida Wang, Sung-Liang Chen, Yifan Yang, Yi-Chao Chen, Yincheng Jin, Mahanth Gowda · 2026-04-28 04:00

OmniSch: A Multimodal PCB Schematic Benchmark For Structured Diagram Visual Reasoning

arXiv:2604.00270v2 Announce Type: replace Abstract: Recent large multimodal models (LMMs) have made rapid progress in visual grounding, document understanding, and diagram reasoning tasks. However, their ability to convert Printed Circuit Board (PCB) schematic diagrams into machi…

