Google Just Killed Autoregressive AI Generation (DiffusionGemma)

Google DeepMind releases DiffusionGemma, claiming up to 4x faster parallel text generation

By the PulseAugur editorial team · [1 source] · 2026-06-19 15:00

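Google DeepMind has introduced DiffusionGemma, a novel LLM architecture that abandons traditional autoregressive text generation. The new model uses discrete text diffusion to denoise and generate whole blocks of tokens simultaneously, rather than producing one token at a time. This parallel approach reportedly speeds up inference by up to four times on dedicated GPUs. The model uses a mixture-of-experts (MoE) design that activates about 3.8 billion parameters out of a backbone of about 26 billion. It is open-sourced under the Apache 2.0 license and supports Hugging Face Transformers and vLLM for deployment.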
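To make the claimed difference concrete, here is a minimal sketch that only counts forward passes under each scheme. It is not DiffusionGemma's implementation: the block size, the number of denoising steps, and the output length are hypothetical values chosen for illustration, and the function names are invented for this example. Only the parameter counts (about 3.8 billion active out of about 26 billion) come from the summary above.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Autoregressive decoding: one forward pass per generated token."""
    return num_tokens


def block_diffusion_passes(num_tokens: int, block_size: int, denoise_steps: int) -> int:
    """Block diffusion: every block is refined in parallel over a fixed
    number of denoising steps, so cost scales with blocks, not tokens."""
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * denoise_steps


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of MoE parameters used for a single token."""
    return active_params / total_params


# Hypothetical settings, NOT taken from the DiffusionGemma release.
NUM_TOKENS = 256
BLOCK_SIZE = 32
DENOISE_STEPS = 8

ar = autoregressive_passes(NUM_TOKENS)
bd = block_diffusion_passes(NUM_TOKENS, BLOCK_SIZE, DENOISE_STEPS)

print(f"autoregressive passes:  {ar}")
print(f"block diffusion passes: {bd}")
print(f"pass-count ratio:       {ar / bd:.1f}x")

# Parameter counts as reported in the summary (approximate).
print(f"active parameter share: {active_fraction(3.8e9, 26e9):.1%}")
```

With these assumed settings the pass count drops from 256 to 64. A ratio of forward passes is not the same as wall-clock speedup, since each denoising pass processes a whole block and may cost more than a single-token pass, so this should be read as intuition for the reported figure rather than a reproduction of it.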

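Impact: This new diffusion-based generation method could significantly speed up LLM inference, potentially changing the paradigm for real-time AI applications and lowering compute costs.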

Ranking rationale: A frontier-lab model release with a novel architecture and performance claims.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

dev.to (LLM tag) · Hector Aryiku · 2026-06-19 15:00

Google Just Killed Autoregressive AI Generation (DiffusionGemma)

Traditional Large Language Models (LLMs) are heavily bottlenecked by generating text one token at a time. Each new token requires a full forward pass through the network, capping inference efficiency and raising computational overhead.

Google DeepMind’s …
