Are these quants of QAT better than non-QAT? What do I use?

r/LocalLLaMA user seeks advice on Gemma 4 31B quantization and hardware optimization

By the PulseAugur editorial team · [1 source] · 2026-06-10 19:23

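A user on the r/LocalLLaMA subreddit is asking for advice on optimizing their setup for running large language models, specifically the Gemma 4 31B model. They want to know whether the newly released QAT (quantization-aware training) versions of the model outperform the version they currently use, which has not been optimized by unsloth. The user also asks which quantization level to choose (for example Q2_K or Q4_0) and how to make the best use of their hardware, a 3060 12GB GPU and 32GB of RAM, to reach longer context lengths and possibly use MTP (multi-token prediction).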
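To make the hardware question concrete, here is a rough back-of-the-envelope sketch of how large the quantized weights would be and what share of them a 12 GB card could hold. It rests on several assumptions that are not in the original post: the parameter count is taken as 31e9 from the model name, the Q4_0 figure of 4.5 bits per weight comes from its block layout (32 four-bit weights plus one 16-bit scale), and the 2 GB reserve for the KV cache and buffers is a hypothetical placeholder. Real GGUF files mix tensor types, so actual sizes will differ somewhat.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fraction_on_gpu(weight_gb: float, vram_gb: float, reserve_gb: float) -> float:
    """Share of the weights that fits in VRAM after reserving some headroom."""
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(usable_gb / weight_gb, 1.0)


N_PARAMS = 31e9    # assumption: parameter count read off the "31B" in the name
Q4_0_BPW = 4.5     # Q4_0 block: 32 x 4-bit weights + one fp16 scale = 18 bytes
VRAM_GB = 12.0     # the 3060 12GB card mentioned in the post
RESERVE_GB = 2.0   # hypothetical headroom for KV cache and compute buffers

size_gb = weight_size_gb(N_PARAMS, Q4_0_BPW)
gpu_share = fraction_on_gpu(size_gb, VRAM_GB, RESERVE_GB)

print(f"Q4_0 weights: about {size_gb:.1f} GB")
print(f"Share of weights that fits on the GPU: about {gpu_share:.0%}")
```

Under these assumptions the Q4_0 weights come to roughly 17.4 GB, more than the card's 12 GB, so a portion of the model would have to stay in system RAM. Longer contexts enlarge the KV cache and shrink the share of weights that fits on the GPU further.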

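Ranking rationale: user-generated content on a niche subreddit discussing model quantization and hardware optimization; not a major industry event.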


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · TIER_1 · English (EN) · /u/ThrowawayProgress99 · 2026-06-10 19:23

Are these quants of QAT better than non-QAT? What do I use?

https://huggingface.co/mradermacher/gemma-4-31B-it-qat-q4_0-unquantized-i1-GGUF/tree/main

https://huggingface.co/mradermacher/…

