PulseAugur

New OMHBench evaluates omni-modal reasoning, finds speech modality lacking

Researchers have introduced OMHBench, a benchmark designed to evaluate the multi-hop reasoning capabilities of multimodal large language models (MLLMs) that support omni-modal processing across text, vision, and speech. The benchmark comprises 6,144 questions with balanced reasoning paths across the three modalities, aiming to overcome limitations of existing evaluation frameworks, such as modality shortcuts. Evaluations on OMHBench revealed a significant performance gap between proprietary and open-source MLLMs; even leading proprietary models were sensitive to variations in the reasoning path and struggled particularly with speech processing.
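
The summary does not show OMHBench's actual data schema or evaluation code, but a minimal sketch helps illustrate why balanced reasoning paths matter: if each item records its modality path, accuracy can be broken out per path, exposing weaknesses such as speech-heavy hops. The OmniHopQuestion fields and predict() interface below are assumptions for illustration only, not the paper's implementation.

```python
# Hypothetical sketch only: OMHBench's real data schema and loading API are not
# shown in the summary, so the OmniHopQuestion fields and predict() interface
# below are assumptions for illustration.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OmniHopQuestion:
    question: str
    reasoning_path: List[str]  # ordered modalities, e.g. ["text", "vision", "speech"]
    answer: str


def accuracy_by_path(
    items: List[OmniHopQuestion],
    predict: Callable[[OmniHopQuestion], str],
) -> Dict[str, float]:
    """Exact-match accuracy grouped by reasoning path, to surface modality gaps."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for item in items:
        key = " -> ".join(item.reasoning_path)
        totals[key] = totals.get(key, 0) + 1
        if predict(item).strip().lower() == item.answer.strip().lower():
            hits[key] = hits.get(key, 0) + 1
    return {key: hits.get(key, 0) / count for key, count in totals.items()}


if __name__ == "__main__":
    # Toy items standing in for real benchmark questions.
    sample = [
        OmniHopQuestion("Which landmark is pictured, and what city is named in the clip?",
                        ["vision", "speech"], "paris"),
        OmniHopQuestion("Which year do the caption and the transcript agree on?",
                        ["text", "speech"], "1969"),
    ]
    print(accuracy_by_path(sample, lambda q: "paris"))  # trivial stand-in "model"
```

Grouping scores this way is what makes path balance useful: a model that shortcuts a modality shows up as a dip on exactly the paths that require it.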

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a new evaluation standard for omni-modal LLMs, highlighting current model weaknesses in speech processing and balanced reasoning.

RANK_REASON The cluster describes a new academic paper introducing a novel benchmark for evaluating multimodal large language models.


COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Seunghee Kim, Ingyu Bang, Seokgyu Jang, Changhyeon Kim, Sanghwan Bae, Jihun Choi, Richeng Xuan, Taeuk Kim

    OMHBench: Benchmarking Balanced and Grounded Omni-Modal Multi-Hop Reasoning

    arXiv:2508.16198v3 Announce Type: replace Abstract: Multimodal Large Language Models (MLLMs) have increasingly supported omni-modal processing across text, vision, and speech. However, existing evaluation frameworks for such models suffer from critical limitations, including moda…