Alibaba's new AI voice model, Fun-Realtime-TTS-Preview, ranked fifth worldwide and first in China on the Speech Arena benchmark. The model demonstrated strong performance across multiple voice capabilities, including speech-to-text (ASR), text-to-speech (TTS), and end-to-end voice understanding and conversation (Chat). Notably, Alibaba's ASR model also achieved the lowest word error rate in a separate evaluation, highlighting its accuracy in transcribing speech.
IMPACT Demonstrates advanced capabilities in voice AI, particularly for diverse languages and accents, potentially influencing future voice assistant development.
RANK_REASON Significant benchmark result for a major tech company's AI model, which placed first among entrants from China.
AI-generated summary · Google Gemini · from 3 sources.