PulseAugur

New AQUA-Bench evaluates audio LLMs on unanswerable questions

Researchers have introduced AQUA-Bench, a benchmark that evaluates audio question-answering models on their ability to recognize unanswerable questions. Existing benchmarks focus almost exclusively on answerable queries, leaving a gap in assessing model reliability when the audio is misleading or irrelevant to the question. AQUA-Bench fills this gap with scenarios such as absent-answer detection and incompatible question-audio pairings, and finds that current models struggle significantly on unanswerable questions even while performing well on standard tasks.
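To make the evaluation idea concrete, here is a minimal sketch of how one might score a model separately on answerable accuracy and unanswerable-question detection. The item format, the `unanswerable` sentinel, and the metric names are illustrative assumptions, not AQUA-Bench's actual schema or scoring code.

```python
# Hypothetical scoring sketch in the spirit of what AQUA-Bench measures:
# a model should answer answerable items correctly and abstain on
# unanswerable ones instead of hallucinating an answer.

UNANSWERABLE = "unanswerable"  # sentinel the model should emit to abstain

def score(items):
    """items: list of (gold, prediction) pairs, where gold is either
    the reference answer or the UNANSWERABLE sentinel."""
    answerable = [(g, p) for g, p in items if g != UNANSWERABLE]
    unanswerable = [(g, p) for g, p in items if g == UNANSWERABLE]
    # Accuracy on items that have an answer.
    acc = sum(g == p for g, p in answerable) / max(len(answerable), 1)
    # Fraction of unanswerable items where the model correctly abstained.
    abstain = sum(p == UNANSWERABLE for _, p in unanswerable) / max(len(unanswerable), 1)
    return {"answerable_acc": acc, "unanswerable_detection": abstain}

preds = [
    ("a dog barking", "a dog barking"),  # answerable, answered correctly
    ("unanswerable", "a car horn"),      # unanswerable, model hallucinated
    ("unanswerable", "unanswerable"),    # unanswerable, correctly abstained
]
print(score(preds))  # {'answerable_acc': 1.0, 'unanswerable_detection': 0.5}
```

Reporting the two numbers separately exposes the gap the paper highlights: a model can have high answerable accuracy while rarely abstaining when it should.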

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Highlights a critical gap in current audio-language models' ability to discern unanswerable questions, pushing for more robust and trustworthy systems.

RANK_REASON Introduces a new benchmark for audio question answering, published on arXiv.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Chun-Yi Kuan, Hung-yi Lee

    AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

    arXiv:2601.12248v2 Announce Type: replace-cross Abstract: Recent advances in audio-aware large language models have shown strong performance on audio question answering. However, existing benchmarks mainly cover answerable questions and overlook the challenge of unanswerable ones…