MapTab: Are MLLMs Ready for Multi-Criteria Route Planning in Heterogeneous Graphs?
Researchers have introduced MapTab, a benchmark for evaluating the multi-criteria reasoning abilities of multimodal large language models (MLLMs). Its route planning tasks combine visual map data with structured tabular information on criteria such as time and price. MapTab covers two scenarios, Metromap and Travelmap, each with a large set of maps, queries, and questions. Initial evaluations indicate that current MLLMs struggle with these multimodal reasoning tasks and, when visual perception is limited, sometimes underperform unimodal approaches.
IMPACT: Establishes a new evaluation standard for multimodal LLMs, pushing for more robust reasoning capabilities beyond current benchmarks.
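To make the task concrete, here is a minimal Python sketch of multi-criteria route planning over a small graph whose edges carry both a travel time and a price. It is only an illustration of the kind of problem described, not MapTab's data format or evaluation method: the `GRAPH` network, the `best_route` function, and the weighted-sum scoring are all assumptions introduced here.

```python
import heapq

# Hypothetical toy network: station -> list of (neighbor, minutes, price).
# These values are invented for illustration and are not taken from MapTab.
GRAPH = {
    "A": [("B", 10, 5.0), ("C", 4, 9.0)],
    "B": [("D", 10, 1.0)],
    "C": [("D", 4, 9.0)],
    "D": [],
}


def best_route(graph, start, goal, w_time=1.0, w_price=1.0):
    """Dijkstra's algorithm on the combined cost w_time*minutes + w_price*price.

    Assumes non-negative weights, so the combined edge cost is never negative.
    """
    frontier = [(0.0, start, [start])]  # (cost so far, node, path taken)
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if node in settled:
            continue
        settled.add(node)
        for nxt, minutes, price in graph[node]:
            if nxt not in settled:
                step = w_time * minutes + w_price * price
                heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return None, float("inf")  # goal not reachable from start


# Optimizing for time alone and for price alone selects different routes.
fastest = best_route(GRAPH, "A", "D", w_time=1.0, w_price=0.0)
cheapest = best_route(GRAPH, "A", "D", w_time=0.0, w_price=1.0)
print(fastest)   # (['A', 'C', 'D'], 8.0)
print(cheapest)  # (['A', 'B', 'D'], 6.0)
```

The weighted sum is one simple way to trade criteria off against each other. A query that instead imposes a hard constraint, such as the fastest route under a price cap, would need a different search.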