PulseAugur

New AQUA-Bench evaluates audio LLMs on unanswerable questions

Researchers have introduced AQUA-Bench, a benchmark that evaluates audio question-answering models on their ability to recognize unanswerable questions. Existing benchmarks focus almost exclusively on answerable queries, leaving a gap in assessing model reliability when the audio is misleading or irrelevant to the question. AQUA-Bench fills this gap with scenarios such as absent-answer detection and incompatible question-audio pairings, and finds that current models struggle significantly on unanswerable questions even while performing well on standard tasks.
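To make the evaluation idea concrete, here is a minimal sketch of how one might score a model separately on answerable accuracy and unanswerable-question detection. The item format, the `unanswerable` sentinel, and the metric names are illustrative assumptions, not AQUA-Bench's actual schema or scoring code.

```python
# Hypothetical scoring sketch in the spirit of what AQUA-Bench measures:
# a model should answer answerable items correctly and abstain on
# unanswerable ones instead of hallucinating an answer.

UNANSWERABLE = "unanswerable"  # sentinel the model should emit to abstain

def score(items):
    """items: list of (gold, prediction) pairs, where gold is either
    the reference answer or the UNANSWERABLE sentinel."""
    answerable = [(g, p) for g, p in items if g != UNANSWERABLE]
    unanswerable = [(g, p) for g, p in items if g == UNANSWERABLE]
    # Accuracy on items that have an answer.
    acc = sum(g == p for g, p in answerable) / max(len(answerable), 1)
    # Fraction of unanswerable items where the model correctly abstained.
    abstain = sum(p == UNANSWERABLE for _, p in unanswerable) / max(len(unanswerable), 1)
    return {"answerable_acc": acc, "unanswerable_detection": abstain}

preds = [
    ("a dog barking", "a dog barking"),  # answerable, answered correctly
    ("unanswerable", "a car horn"),      # unanswerable, model hallucinated
    ("unanswerable", "unanswerable"),    # unanswerable, correctly abstained
]
print(score(preds))  # {'answerable_acc': 1.0, 'unanswerable_detection': 0.5}
```

Reporting the two numbers separately exposes the gap the paper highlights: a model can have high answerable accuracy while rarely abstaining when it should.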

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Highlights a critical gap in current audio-language models' ability to discern unanswerable questions, pushing for more robust and trustworthy systems.

RANK_REASON Introduces a new benchmark for audio question answering, published on arXiv.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Chun-Yi Kuan, Hung-yi Lee

    AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

    arXiv:2601.12248v2 Announce Type: replace-cross Abstract: Recent advances in audio-aware large language models have shown strong performance on audio question answering. However, existing benchmarks mainly cover answerable questions and overlook the challenge of unanswerable ones…