New open-source voice model listens nonstop and decides every 0.4 seconds whether to speak or stay silent

Open-source audio interaction model processes speech in real time

By the PulseAugur editorial team · [1 source] · 2026-06-06 10:47

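A new open-source voice model called Audio Interaction has been released. It processes audio in real time, without waiting for the input to end. The model can translate, transcribe, and hold a conversation continuously, and it can even recognize ambient sounds such as coughing. Its code and weights are available on GitHub under an open-source license; the training data is to be released later.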

Impact: Enables continuous, real-time voice interaction and ambient sound recognition in open-source applications.

Ranking rationale: Release of an open-source model with novel real-time processing capabilities.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

The Decoder (Tier 1, English) · Jonathan Kemper · 2026-06-06 10:47

New open-source voice model listens nonstop and decides every 0.4 seconds whether to speak

[Image: Abstract representation of colorful audio waveforms flowing and transforming through geometric structures.]