PulseAugur / Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. IHBench: Evaluating Post-Interruption Recovery in Voice Agents with Structured Workflows

    IHBench is a new benchmark that evaluates how well voice agents recover from user interruptions within structured workflows. It assesses task fulfillment and recovery quality across ten enterprise domains and six interruption types. Evaluations of 27 audio-language model configurations found that closed-weight models, such as those from OpenAI and Google, generally outperform open-weight models at handling interruptions: they degrade more slowly over longer conversations and show no modality gap.

    IMPACT: This benchmark could drive improvements in the robustness and usability of voice agents in enterprise settings.