PulseAugur

BayLing-Duplex enables native full-duplex speech dialogue with a single LLM

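Researchers have developed BayLing-Duplex, a full-duplex speech language model that can listen and speak at the same time without relying on an external turn-taking module. This single autoregressive LLM handles natural conversational phenomena such as interruptions and hesitation. Fine-tuned on a modest-sized dataset, BayLing-Duplex achieves high success rates in turn-taking and interruption handling while maintaining or improving response quality compared with turn-based models.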

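Impact: By enabling truly real-time, simultaneous speech interaction, this research could accelerate the development of more natural and more responsive conversational AI agents.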

Ranking rationale: This cluster contains an academic paper detailing a new model architecture and experimental results.

Read on arXiv cs.CL

AI-generated summary · Google Gemini · from 3 sources.


Sources [3]

  1. arXiv cs.CL TIER_1 English(EN) · Wenqian Cui, Lei Zhu, Xiaohui Li, Zhihan Guo, Haoli Bai, Lu Hou, Irwin King ·

    TurnGuide: Enhancing Meaningful Full Duplex Spoken Interactions via Dynamic Turn-Level Text-Speech Interleaving

    arXiv:2508.07375v3 Announce Type: replace Abstract: Full-Duplex Speech Language Models (FD-SLMs) are specialized foundation models designed to enable natural, real-time spoken interactions by modeling complex conversational turn-taking such as interruptions, backchannels, and ove…

  2. arXiv cs.CL TIER_1 English(EN) · Qingkai Fang, Shoutao Guo, Yang Feng ·

    BayLing-Duplex: Native Full-Duplex Speech Dialogue with a Single Autoregressive LLM

    arXiv:2606.14528v1 Announce Type: new Abstract: Real-time, full-duplex speech interaction is a key feature of next-generation spoken chatbots, allowing the model to listen and speak at the same time and to handle natural phenomena such as overlap, hesitation, and barge-in. Existi…

  3. arXiv cs.CL TIER_1 English(EN) · Yang Feng ·

    BayLing-Duplex: Native Full-Duplex Speech Dialogue with a Single Autoregressive LLM

    Real-time, full-duplex speech interaction is a key feature of next-generation spoken chatbots, allowing the model to listen and speak at the same time and to handle natural phenomena such as overlap, hesitation, and barge-in. Existing speech language models (SpeechLMs) such as LL…