New LM-SPT method improves speech tokenization for better language model alignment

By PulseAugur Editorial · [1 source] · 2026-06-16 04:00

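Researchers have developed LM-SPT, a new speech tokenization method designed to improve alignment between speech and language models. Earlier approaches distill teacher features directly or rely on pooling; LM-SPT instead uses a semantic speech resynthesis distillation process. This indirect supervision encourages the tokenizer to learn dedicated semantic units that are better aligned with language models, even at reduced frame rates. According to the paper, the method delivers superior performance on automatic speech recognition and text-to-speech tasks without sacrificing speech reconstruction fidelity.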
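The contrast between the two kinds of supervision can be illustrated with a toy sketch. It is not the authors' implementation: it assumes one plausible reading of "resynthesis distillation", namely that speech is rebuilt from the tokens alone and a frozen teacher encoder then compares the original and rebuilt waveforms. Every name, size, and linear "model" below (`teacher_encode`, `tokenize`, `resynthesize`, `FRAME`, `DIM`, `CODES`) is hypothetical, and nothing is trained; the code only computes the two loss values.

```python
import numpy as np

rng = np.random.default_rng(0)
FRAME, DIM, CODES = 4, 8, 16  # illustrative sizes, not taken from the paper

# Random linear maps stand in for real networks (assumption for the sketch).
W_teacher = rng.normal(size=(FRAME, DIM))      # frozen teacher encoder
W_student = rng.normal(size=(2 * FRAME, DIM))  # tokenizer encoder at half the teacher frame rate
codebook = rng.normal(size=(CODES, DIM))       # embeddings of the discrete tokens
W_decoder = rng.normal(size=(DIM, 2 * FRAME))  # decoder from token embeddings to samples


def teacher_encode(wave):
    """Frame the waveform and project each frame to a teacher feature."""
    return wave.reshape(-1, FRAME) @ W_teacher


def tokenize(wave):
    """Encode at half the teacher frame rate, then pick the nearest codebook entry."""
    feats = wave.reshape(-1, 2 * FRAME) @ W_student
    dists = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def resynthesize(tokens):
    """Rebuild a waveform from the token embeddings only."""
    return (codebook[tokens] @ W_decoder).reshape(-1)


def pooled_distillation_loss(wave):
    """Direct supervision: average-pool teacher frames to the token rate, then match features."""
    pooled = teacher_encode(wave).reshape(-1, 2, DIM).mean(axis=1)
    return float(((codebook[tokenize(wave)] - pooled) ** 2).mean())


def resynthesis_distillation_loss(wave):
    """Indirect supervision: compare teacher features of the original and resynthesized audio."""
    recon = resynthesize(tokenize(wave))
    return float(((teacher_encode(recon) - teacher_encode(wave)) ** 2).mean())


wave = rng.normal(size=64)  # stand-in for a short utterance
tokens = tokenize(wave)
print(tokens.shape, pooled_distillation_loss(wave), resynthesis_distillation_loss(wave))
```

In the pooled variant the teacher features have to be averaged down to the token rate before they can be compared. In the resynthesis variant the comparison happens at the teacher's own frame rate, because both inputs to the teacher are full-length waveforms, so the token frame rate can change without changing how the loss is computed.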


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI TIER_1 English(EN) · Daejin Jo, Jeeyoung Yun, Byungseok Roh, Sungwoong Kim · 2026-06-16 04:00

LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization

arXiv:2506.16738v2 · Abstract: With the rapid progress of speech language models (SLMs), discrete speech tokens have emerged as a core interface between speech and text, enabling unified modeling across modalities. Recent speech tokenization approaches …
