Researchers have introduced OMHBench, a new benchmark designed to evaluate the multi-hop reasoning capabilities of omni-modal large language models (MLLMs). The benchmark comprises 6,144 questions with balanced reasoning paths across text, vision, and speech modalities, aiming to overcome limitations of existing evaluation frameworks, such as modality shortcuts. Evaluations on OMHBench revealed a significant performance gap between proprietary and open-source MLLMs, with even leading proprietary models showing sensitivity to reasoning path variations and struggling particularly with speech processing.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a new evaluation standard for omni-modal LLMs, highlighting current model weaknesses in speech processing and balanced reasoning.
RANK_REASON The cluster describes a new academic paper introducing a novel benchmark for evaluating multimodal large language models.