Has there been any recent new development on which quant is considered optimal?

LLaMA users discuss the best quantization method for local models

By the PulseAugur editorial team · [1 source] · 2026-06-06 12:13

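A discussion on the r/LocalLLaMA subreddit asks which quantization level is currently considered best for large language models. The poster recalls that q4 quantization was once regarded as optimal because it balanced model quality against VRAM usage, and notes that Apple has even used it for on-device applications. The thread aims to establish whether newer quantization techniques have since surpassed q4 in efficiency and quality.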

Ranking rationale: users discussing model quantization on a subreddit, not a primary-source release or a major industry event.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA TIER_1 English(EN) · /u/takuonline · 2026-06-06 12:13

Has there been any recent new development on which quant is considered optimal?

I recall in earlier days, q4 was said to be optimal. That is to say, if you have a small q8 model, a medium q4 model, and a large q2 model, then assuming they use the same amount of GPU VRAM, the medium q4 would be the best-performing …
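
The equal-VRAM comparison in the quoted post can be made concrete with rough arithmetic. The sketch below is not from the thread: the function name `params_that_fit` and the 8 GiB budget are hypothetical, and it assumes the budget is spent on weights only, at exactly the nominal bit width. Real quantization formats also store scales and other metadata, and inference needs room for the KV cache and activations, so actual model sizes will be somewhat smaller.

```python
def params_that_fit(vram_gib: float, bits_per_weight: float) -> float:
    """Approximate weight count (in billions) that fits in a VRAM budget.

    Assumption: weights only, at the nominal bit width. Ignores
    quantization metadata, KV cache, and activation memory.
    """
    total_bits = vram_gib * 1024**3 * 8  # GiB -> bytes -> bits
    return total_bits / bits_per_weight / 1e9


budget_gib = 8.0  # hypothetical VRAM budget for illustration

# Nominal bit widths for the three cases named in the post.
for name, bits in [("small q8", 8), ("medium q4", 4), ("large q2", 2)]:
    billions = params_that_fit(budget_gib, bits)
    print(f"{name}: about {billions:.1f}B parameters in {budget_gib:.0f} GiB")
```

Halving the bits per weight doubles the number of parameters that fit in the same budget, which is why the post frames the choice as small q8 versus medium q4 versus large q2. Which of the three gives the best output quality is an empirical question that this arithmetic does not answer.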