PulseAugur
English(EN) Adaptive Turn-Taking for Real-time Multi-Party Voice Agents

New voice agent improves turn-taking in multi-party conversations

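Researchers have developed ModeratorLM, a new voice agent designed to improve turn-taking in real-time multi-party conversations. The system builds on a speech large language model and assigns the agent a specific role for managing conversational flow, particularly when speakers compete for the floor. A reasoning-augmented variant adds chain-of-thought processing to strengthen contextual understanding. Experiments show significant gains in turn-taking accuracy and recall, along with fewer interruptions.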

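Impact: Strengthens the ability of AI agents to participate naturally in group conversations, which may improve the user experience in collaborative AI applications.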

Ranking rationale: This cluster contains a research paper detailing a new model and dataset for improving turn-taking in voice agents.


AI-generated summary · Google Gemini · drawn from 2 sources.

Sources [2]

  1. arXiv cs.AI TIER_1 English(EN) · Soumyajit Mitra, Prabhat Pandey, Abhinav Jain, Shanmukha Sahith, K V Vijay Girish

    Adaptive Turn-Taking for Real-time Multi-Party Voice Agents

    arXiv:2606.13544v1 Announce Type: cross Abstract: Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor competition and varying user expectations. We propose ModeratorLM, a role-playing voice agent…

  2. arXiv cs.AI TIER_1 English(EN) · K V Vijay Girish

    Adaptive Turn-Taking for Real-time Multi-Party Voice Agents

    Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor competition and varying user expectations. We propose ModeratorLM, a role-playing voice agent that conditions turn-taking behavior on an explic…