InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information
Researchers have developed new frameworks and benchmarks to improve how multimodal large language models (MLLMs) reason over complex visual data such as charts. One approach, HierVA, uses a hierarchical agent to manage context in a joint image-text space, separating high-level planning from specialized reasoning workers. Another model, Chart-FR1, employs a focus-driven chain-of-thought process to enhance perception and adaptive reasoning on charts with high information density. To evaluate these advances, new benchmarks such as InterChart and HID-Chart are being introduced to test how well MLLMs understand and reason over multiple, dense, or distributed charts.
AI IMPACT: Advances in chart reasoning for MLLMs could improve data analysis and interpretation in fields like finance and scientific reporting.