PulseAugur

New AI models tackle complex chart reasoning and generation challenges

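Researchers have developed new frameworks and benchmarks to improve how multimodal large language models (MLLMs) reason over complex visual data such as charts. One approach, HierVA, uses hierarchical agents to manage context in a joint image-text space, separating high-level planning from specialized reasoning workers. Another model, Chart-FR1, adopts a focus-driven chain-of-thought process to strengthen perception and adaptive reasoning on charts with high information density. To evaluate these advances, new benchmarks such as InterChart and HID-Chart are being introduced to test how well MLLMs understand and reason over multiple, dense, or distributed sources of chart information.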

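Impact: Advances in MLLM chart reasoning could improve data analysis and interpretation in fields such as financial and scientific reporting.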

Ranking rationale: Multiple research papers introduce new models and benchmarks for multimodal chart reasoning.


AI-generated summary · Google Gemini · from 5 sources.


Sources [5]

  1. arXiv cs.CL TIER_1 English(EN) · Qihua Dong, Ruozhen He, Junwen Chen, Yizhou Wang, Xu Ma, Songyao Jiang, Yun Fu

    Hierarchical Visual Agent: Managing Contexts in Joint Image-Text Space for Advanced Chart Reasoning

    arXiv:2605.04304v1 Announce Type: cross Abstract: Advanced chart question answering requires both precise perception of small visual elements and multi-step reasoning across several subplots. While existing MLLMs are strong at understanding single plots, they often struggle with …

  2. arXiv cs.CL TIER_1 English(EN) · Yun Fu

    Hierarchical Visual Agent: Managing Contexts in Joint Image-Text Space for Advanced Chart Reasoning

    Advanced chart question answering requires both precise perception of small visual elements and multi-step reasoning across several subplots. While existing MLLMs are strong at understanding single plots, they often struggle with multi-step reasoning across multiple subplots. We …

  3. arXiv cs.CV TIER_1 English(EN) · Shishi Xiao, Tongyu Zhou, David Laidlaw, Gromit Yeuk-Yin Chan

    ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control

    arXiv:2603.14209v2 Announce Type: replace Abstract: A pictorial chart is an effective medium for visual storytelling, seamlessly integrating visual elements with data charts. However, creating such images is challenging because the flexibility of visual elements often conflicts w…

  4. arXiv cs.CV TIER_1 English(EN) · Hongkun Pan, Yuwei Wu, Wanyi Hong, Shenghui Hu, Qitong Yan, Yi Yang, Rufei Han, Changju Zhou, Minfeng Zhu, Dongming Han, Wei Chen

    Chart-FR1: Visual Focus-Driven Fine-Grained Reasoning on Dense Charts

    arXiv:2605.01882v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) have shown considerable potential in chart understanding and reasoning tasks. However, they still struggle with high information density (HID) charts characterized by multiple subplots, legen…

  5. arXiv cs.CV TIER_1 English(EN) · Anirudh Iyengar Kaniyar Narayana Iyengar, Srija Mukhopadhyay, Adnan Qidwai, Shubhankar Singh, Dan Roth, Vivek Gupta

    InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information

    arXiv:2508.07630v2 Announce Type: replace-cross Abstract: We introduce InterChart, a diagnostic benchmark that evaluates how well vision-language models (VLMs) reason across multiple related charts, a task central to real-world applications such as scientific reporting, financial…