bosonai/higgs-audio-v3-tts-4b
Boson AI has released Higgs Audio v3 TTS, a text-to-speech model designed for conversational voice chat. The model supports over 100 languages and offers zero-shot voice cloning and fine-grained control over emotion, style, and prosody. It uses an autoregressive decoder that operates on interleaved text and audio tokens, with audio encoded into codebooks. The model is released for research; commercial use requires a separate license, and unlawful applications are strictly prohibited.
IMPACT Provides advanced conversational TTS capabilities for research and potential commercial applications.
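
To make the "interleaved text and audio tokens" idea concrete, here is a minimal toy sketch in Python. It is not Boson AI's implementation: the vocabulary size, codebook size, number of codebooks, the chunking scheme, and the helper names `audio_token_id` and `interleave` are all hypothetical, chosen only to illustrate how text tokens and multi-codebook audio frames could share one sequence for an autoregressive decoder.

```python
from typing import List

# All constants below are illustrative assumptions, not the model's real values.
TEXT_VOCAB = 1000      # hypothetical text vocabulary size
CODEBOOK_SIZE = 256    # hypothetical number of entries per audio codebook
NUM_CODEBOOKS = 4      # hypothetical number of codebooks per audio frame


def audio_token_id(codebook: int, index: int) -> int:
    """Map (codebook, index) to an id placed after the text ids in a shared vocabulary."""
    return TEXT_VOCAB + codebook * CODEBOOK_SIZE + index


def interleave(text_chunks: List[List[int]],
               audio_chunks: List[List[List[int]]]) -> List[int]:
    """Alternate each text chunk with the audio frames paired with it.

    Each audio frame is a list of NUM_CODEBOOKS codebook indices, flattened
    into consecutive ids so the decoder sees a single token stream.
    """
    stream: List[int] = []
    for text, frames in zip(text_chunks, audio_chunks):
        stream.extend(text)
        for frame in frames:
            assert len(frame) == NUM_CODEBOOKS
            stream.extend(audio_token_id(k, idx) for k, idx in enumerate(frame))
    return stream


# Two text chunks, each followed by its (made-up) audio frames.
text_chunks = [[5, 17], [42]]
audio_chunks = [[[1, 2, 3, 4], [0, 0, 0, 0]], [[255, 255, 255, 255]]]
sequence = interleave(text_chunks, audio_chunks)
print(sequence)
```

In this toy layout, a decoder trained on such sequences would predict the next id whether it falls in the text range or in one of the audio codebook ranges. Real systems may arrange codebooks differently, for example by predicting them in parallel or with a delay pattern.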