Audio2Tool: Bridging Spoken Language Understanding and Function Calling
Researchers have introduced Audio2Tool, a benchmark dataset for evaluating the function-calling capabilities of spoken language models. The dataset contains approximately 30,000 queries across smart car, smart home, and wearable domains, organized in a complexity hierarchy that runs from simple commands to multi-intent requests. Evaluations showed that current state-of-the-art models degrade significantly on compositional requests and under acoustic variation, pointing to areas for future improvement.
AI IMPACT: Introduces a new benchmark for evaluating how well spoken language models call tools, potentially driving improvements in voice assistant capabilities.