Thinking While Speaking: Inference-Time Knowledge Transfer for Responsive and Intelligent Conversational Voice Agents

New technique bridges the latency-capability gap in voice agents

By the PulseAugur editorial team · [1 source] · 2026-06-24 04:00

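Researchers have developed a technique called "conversational infill" to address the latency-capability trade-off in voice agents. A small, fast "speaker" model generates immediate responses while, at inference time, integrating knowledge from a larger, slower "reasoner" model. The authors built a synthetic dataset of more than 290,000 examples to train seven small language models, demonstrating that the approach can substantially cut response time while maintaining high accuracy. User studies indicate that agents using conversational infill are perceived as on par with frontier models in capability and as more responsive, especially on retrieval-heavy tasks.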
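The division of labor described above can be illustrated with a minimal sketch. It is not the authors' implementation: both "models" are stand-in functions, and every name here (`reasoner`, `speaker_chunk`, `respond`, the delay parameters) is a hypothetical chosen for illustration. The sketch only shows the control flow, in which a fast speaker keeps producing output while a slow reasoner runs concurrently, and the speaker folds in the reasoner's result once it arrives.

```python
import asyncio
from typing import List, Optional


async def reasoner(query: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for the large, slow model (retrieval, tools, reasoning)."""
    await asyncio.sleep(delay)  # simulates slow, iterative work
    return f"fact about {query!r}"


def speaker_chunk(query: str, knowledge: Optional[str], turn: int) -> str:
    """Hypothetical stand-in for the small, fast model.

    Without knowledge it emits conversational filler; once knowledge
    is available it produces a grounded reply.
    """
    if knowledge is None:
        return f"[filler {turn}] Let me look into {query}."
    return f"[grounded] Here is what I found: {knowledge}."


async def respond(
    query: str, chunk_interval: float = 0.01, reasoner_delay: float = 0.05
) -> List[str]:
    """Speak immediately while the reasoner runs in the background."""
    task = asyncio.create_task(reasoner(query, reasoner_delay))
    chunks: List[str] = []
    turn = 0
    # The speaker never waits on the reasoner before its first chunk.
    while not task.done():
        chunks.append(speaker_chunk(query, None, turn))
        turn += 1
        await asyncio.sleep(chunk_interval)  # yields control so the reasoner can progress
    # Reasoner finished: the next chunk integrates its knowledge.
    chunks.append(speaker_chunk(query, task.result(), turn))
    return chunks


if __name__ == "__main__":
    for chunk in asyncio.run(respond("store opening hours")):
        print(chunk)
```

The number of filler chunks depends on timing, so only the first chunk (always filler) and the last chunk (always grounded) are deterministic in this sketch. A real system would stream tokens rather than whole sentences, which this example does not attempt to model.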

Impact: Makes voice agents both responsive and capable, improving the user experience in complex conversational tasks.

Ranking rationale: The cluster contains an academic paper detailing a new approach to conversational AI.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL · Vidya Srinivas, Zachary Englhardt, Shwetak Patel, Vikram Iyer · 2026-06-24 04:00

Thinking While Speaking: Inference-Time Knowledge Transfer for Responsive and Intelligent Conversational Voice Agents

arXiv:2511.07397v2 · Abstract: Voice agents face a fundamental tension: the reasoning, retrieval, and tool use that make foundation models capable are iterative and slow, while conversational interaction demands responses on a millisecond timescale. Smaller, …
