Google Ships Gemma 4 QAT Checkpoints: Quantization-Aware Training

Google Releases Gemma 4 Models Trained with Quantization-Aware Training

By PulseAugur Editorial · [1 source] · 2026-06-13 11:18

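Google has released new checkpoints for its Gemma 4 model family that were produced with quantization-aware training (QAT). QAT trains a model so that it stays accurate after its weights are compressed to very low bit widths, for example 4-bit, or even 2-bit for some layers. The goal is to let these models run efficiently on consumer hardware with a much smaller memory footprint: the E2B model, for example, needs only about 1 GB.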
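To make the idea of "training the model to survive low bit widths" concrete, here is a minimal sketch of the quantize-then-dequantize ("fake quantization") step that QAT schemes commonly apply to weights in the forward pass. It is an illustration only: the source does not describe Google's actual quantization scheme, so the symmetric per-tensor scaling, the function names `fake_quantize` and `mean_abs_error`, and the sample weights are all assumptions made for this example. In a real training framework the rounding step is usually treated as the identity during backpropagation (a straight-through estimator), which this forward-only sketch does not show.

```python
def fake_quantize(weights, bits=4):
    """Quantize then dequantize a list of floats.

    Assumption: symmetric, per-tensor scaling. This is a generic
    illustration, not the scheme used for the Gemma 4 checkpoints.
    """
    qmax = 2 ** (bits - 1) - 1      # largest integer level, e.g. 7 for 4-bit
    qmin = -(2 ** (bits - 1))       # smallest integer level, e.g. -8 for 4-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)        # nothing to scale
    scale = max_abs / qmax          # float value represented by one integer step
    result = []
    for w in weights:
        q = round(w / scale)            # snap to the nearest integer level
        q = max(qmin, min(qmax, q))     # clamp into the representable range
        result.append(q * scale)        # map back to a float ("dequantize")
    return result


def mean_abs_error(original, quantized):
    """Average absolute difference between original and quantized weights."""
    return sum(abs(a - b) for a, b in zip(original, quantized)) / len(original)


# Hypothetical sample weights, chosen only for illustration.
sample = [0.8, -0.31, 0.12, 0.0, -0.8]
for bits in (4, 2):
    approx = fake_quantize(sample, bits)
    print(bits, "bit error:", mean_abs_error(sample, approx))
```

During QAT the loss is computed on the fake-quantized weights, so the optimizer is pushed toward weight values that lose little when rounded. Running the loop shows the error growing as the bit width drops from 4 to 2, which is the gap QAT is meant to shrink.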

Impact: Enables efficient on-device AI by significantly reducing model size and memory requirements.
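The memory saving can be checked with back-of-the-envelope arithmetic: weight storage is roughly parameter count times bits per weight, divided by 8 to get bytes. The sketch below assumes that "E2B" corresponds to about 2 billion effective parameters; the source does not state the parameter count, so `ASSUMED_PARAMS` is a guess for illustration. It also ignores quantization scales, any layers kept at higher precision, activations, and the KV cache, all of which add to the real footprint.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 10**9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: "E2B" is read as roughly 2 billion effective parameters.
ASSUMED_PARAMS = 2_000_000_000

for bits in (16, 8, 4, 2):
    print(f"{bits}-bit weights: {weight_memory_gb(ASSUMED_PARAMS, bits):.2f} GB")
```

Under that assumption, 4-bit weights come to about 1 GB, which is consistent with the figure quoted in the summary, while 16-bit weights would take about 4 GB.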

Ranking rationale: Frontier-lab model release accompanied by a system card.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

dev.to (LLM tag) · pueding · 2026-06-13 11:18

Google Ships Gemma 4 QAT Checkpoints: Quantization-Aware Training

 What: Google shipped quantization-aware-trained (QAT) checkpoints for the Gemma 4 family — open weights that were trained to survive being squeezed down to 4-bit (and 2-bit on the decode layers). …