A quick Gemma4 31B comparison (Q4_k_M, QAT, heretic)

Gemma 4 31B QAT version performs well on long context

By the PulseAugur editorial team · [1 source] · 2026-06-06 05:11

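An r/LocalLLaMA user compared three versions of the Gemma 4 31B model: the standard UD version, the "heretic" version, and the QAT version. The standard version struggled with long context and complex tool chains, and the "heretic" version was even more error-prone. The QAT version, by contrast, handled a 32k context effectively, completed all of its reasoning, and executed every task correctly.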

Impact: The QAT version of Gemma 4 31B shows improved long-context handling, which suggests stronger potential for local LLM deployment.

Ranking rationale: A user's comparison of different quantizations and versions of the model.



AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · /u/Some-Cauliflower4902 · 2026-06-06 05:11

A quick Gemma4 31B comparison (Q4_k_M, QAT, heretic)

No numbers. Not sure if anybody cares…

I’ve run the UD version of Q4_k_m for a month. I talk to this model nicely, because it’s a functional nervous wreck. And initially I thought that might be an alignment thing, so I also have the hereti…

