Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

Praxy Voice achieves commercial-grade Indic TTS with minimal intervention

By the PulseAugur editorial team · [3 sources] · 2026-04-28 09:50

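Researchers have developed Praxy Voice, a method that improves Indic text-to-speech (TTS) using a pretrained non-Indic model. It combines three components: the Brahmic Unified Phoneme Space (BUPS) for script romanization, a LoRA adapter on the text-token predictor, and a voice-prompt recovery technique. The method produces commercial-grade audio for Telugu, Tamil, and Hindi without training a new vocoder and without commercial TTS data.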
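For readers unfamiliar with LoRA, the adapter idea mentioned above can be sketched in a few lines. This is a generic illustration of a low-rank update added to a frozen weight matrix, not the Praxy Voice implementation; the function names, matrix sizes, and values are all hypothetical, and the summary does not say which rank or scaling the authors use.

```python
# Illustrative sketch of a LoRA-style low-rank update; not the Praxy Voice code.
# All names, shapes, and values here are hypothetical.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(w_frozen, lora_a, lora_b, x, alpha):
    """Return W x + (alpha / r) * B (A x); w_frozen is never modified."""
    rank = len(lora_a)            # A has shape (r, d_in)
    base = matvec(w_frozen, x)    # frozen path, shape (d_out,)
    down = matvec(lora_a, x)      # project the input down to rank r
    up = matvec(lora_b, down)     # project back up; B has shape (d_out, r)
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, up)]


# Frozen 2x3 weight with a rank-1 adapter.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.5]]         # (r=1, d_in=3)
B_ZERO = [[0.0], [0.0]]       # B is commonly initialised to zero
B_TRAINED = [[1.0], [-1.0]]   # stand-in for values learned during fine-tuning

x = [1.0, 2.0, 3.0]
y_init = lora_forward(W, A, B_ZERO, x, alpha=2.0)
y_tuned = lora_forward(W, A, B_TRAINED, x, alpha=2.0)
```

With `B` at zero the adapted layer reproduces the frozen base exactly, which is why only the small `A` and `B` matrices need training while the base model stays untouched.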

Impact: High-quality Indic TTS can be built on existing models with minimal intervention and no commercial data.

Ranking rationale: An academic paper detailing a new method for TTS synthesis.


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.CL TIER_1 English(EN) · Venkata Pushpak Teja Menta · 2026-04-29 04:00

Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

arXiv:2604.25441v1 Announce Type: cross Abstract: Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopted multilingual base (Chatterb…

arXiv cs.CL TIER_1 English(EN) · Venkata Pushpak Teja Menta · 2026-04-28 09:50

Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopted multilingual base (Chatterbox, 23 languages) does not even tokenise Telugu or…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-04-28 09:50

Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopted multilingual base (Chatterbox, 23 languages) does not even tokenise Telugu or…

