PulseAugur

Alibaba Qwen3.5 model offers real-time translation with voice cloning

Alibaba's Qwen team has released Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model that cuts latency to 2.8 seconds. The model expands language support to 60 input languages and 29 output languages, and it uses visual cues such as lip movements to improve accuracy in noisy environments. A standout feature is its ability to clone the original speaker's voice in real time for the translated output, creating a more natural listening experience.

Impact: Enhances real-time multilingual communication by reducing latency and improving accuracy through multimodal input and voice cloning.

Ranking rationale: Model release from a major AI lab (Alibaba) with significant performance improvements and new capabilities.

Read on MarkTechPost →

AI-generated summary · Google Gemini · from 3 sources. How we write summaries →


Coverage sources [3]

  1. MarkTechPost TIER_1 English(EN) · Asif Razzaq ·

    Alibaba Qwen Team Introduces Qwen3.5-LiveTranslate-Flash: Real-Time Multimodal Interpretation Across 60 Languages at 2.8-Second Latency

    Alibaba's Qwen team has released Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model that processes audio and video simultaneously. The model covers 60 input languages and produces speech output in 29 languages at 2.8 seconds of latency. Key additions over th…

  2. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Alibaba's Qwen team has unveiled Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model that processes audio and video simultaneously. The model

    Alibaba's Qwen team has unveiled Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model that processes audio and video simultaneously. The model covers 60 input languages and produces speech output in 29 languages at just 2.8 seconds latency. Key features include r…

  3. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Alibaba's Qwen team has unveiled Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model processing audio and video simultaneously. The model cove

    Alibaba's Qwen team has unveiled Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model processing audio and video simultaneously. The model covers 60 input languages and produces speech output in 29 languages at just 2.8 seconds latency. Key features include real-…