Researchers have introduced AudioDER, a new dataset designed to enhance the reasoning capabilities of Large Audio-Language Models (LALMs). The dataset addresses the issue of redundancy in existing audio-language datasets by employing a deduplication process to improve diversity. AudioDER contains approximately 191,000 samples, each including an audio clip, a multiple-choice question, answer candidates, an audio caption, and a chain-of-thought rationale generated by Qwen3-30B. Experiments demonstrate that post-training LALMs such as Qwen2-Audio-7B-Instruct on AudioDER leads to consistent performance improvements on various audio reasoning benchmarks.
AI IMPACT This dataset could accelerate progress in audio reasoning for LALMs, leading to more sophisticated audio understanding applications.
RANK_REASON The cluster describes a new academic dataset and research paper focused on improving AI models.
AI-generated summary · Google Gemini · from 2 sources.