Mistral AI has released Voxtral TTS, an open-weights text-to-speech model that rivals ElevenLabs in performance while being significantly more efficient. The 4B-parameter model supports nine languages and uses a novel architecture that combines autoregressive generation of semantic tokens with flow matching for acoustic tokens. The release underscores Mistral's commitment to open research and to expanding the frontier of multimodal AI capabilities.
Ranking rationale: Release of an open-weights TTS model whose novel architecture is discussed in a podcast and an accompanying paper.
- ElevenLabs
- Flow Matching
- Guillaume Lample
- Latent Space
- Mistral AI
- Pavan Kumar Reddy
- Pixtral
- Voxtral TTS
AI-generated summary · Google Gemini · from 2 sources.