Qwen3.5-Omni: Scaling Up, Toward Native Omni-Modal AGI

Alibaba's Qwen3.5-Omni adds audio and video capabilities to a multimodal large language model

By PulseAugur Editorial · [1 source] · 2026-03-29 20:00

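Alibaba's Qwen team has released Qwen3.5-Omni, a new-generation omnimodal large language model that can process text, images, audio, and audio-visual content. The series includes Plus, Flash, and Light versions, all of which support a 256k context window and can handle more than 10 hours of audio. Its architecture uses a Hybrid-Attention Mixture-of-Experts (MoE) design in both its reasoning and generation components (the Thinker and Talker).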

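Impact: Extends large language model capabilities to native audio and video processing, which could enable more sophisticated AI agents and applications.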

Ranking rationale: Frontier lab model release, accompanied by a system card.

Read on the Qwen tech blog →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Qwen tech blog · English (EN) · QwenTeam · 2026-03-29 20:00

Qwen3.5-Omni: Scaling Up, Toward Native Omni-Modal AGI

Qwen3.5-Omni is Qwen’s latest generation of fully omnimodal LLM, supporting the understanding of text, images, audio, and audio-visual content. Both the Thinker and Talker in Qwen3.5-Omni adopt the Hybrid-Attention MoE. Qwen3.5-Omni series includes Instruct versions in three size…
