FP16 on Qwen 3.6 27B

The FP16 versus Q8 quantization performance debate for Qwen 3.6 27B

By the PulseAugur editorial team · [1 source] · 2026-05-29 12:33

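A user on Reddit's r/LocalLLaMA subreddit is asking about the performance difference between FP16 and Q8 quantization for the Qwen 3.6 27B model. FP16 runs slowly on their setup, and they want to know whether there is a notable difference for both the weights and the cache. The user also asks what tokens-per-second (TPS) rate to expect from the model for coding tasks on a Strix Halo system.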

Impact: Discussions of model quantization and performance bear on user experience and hardware optimization.

Ranking rationale: A user discussion about model quantization and performance.



AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · /u/Forward_Jackfruit813 · 2026-05-29 12:33

FP16 on Qwen 3.6 27B

Have there been any notable difference between Q8 and FP16 on both the weights and the cache? I know the jump to Q8 is significant. I would test myself, but FP16 on my setup is painfully slow.

Also side question, is ~14TPS around the numbe…
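The size gap behind the "painfully slow" FP16 experience can be estimated with simple arithmetic. The sketch below is a rough illustration, not a benchmark: it assumes a dense model with 27 billion parameters, a GGUF-style Q8_0 layout (34 bytes per block of 32 weights, i.e. 8.5 bits per weight), and a hypothetical peak memory bandwidth of 256 GB/s for the machine. It ignores KV cache reads, activations, and compute limits, so real throughput would likely sit below these ceilings.

```python
# Back-of-the-envelope sketch: weight size and a bandwidth-bound decode ceiling.
# Every input here is an illustrative assumption, not a measurement.

PARAMS = 27e9  # assumed dense parameter count for a "27B" model

BITS_PER_WEIGHT = {
    "FP16": 16.0,  # 2 bytes per weight
    "Q8_0": 8.5,   # 32 int8 weights + one FP16 scale = 34 bytes per block
}

BANDWIDTH_GB_S = 256.0  # assumed peak memory bandwidth (hypothetical figure)


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights in gigabytes (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def decode_ceiling_tps(params: float, bits_per_weight: float,
                       bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if all weights are read once per token."""
    return bandwidth_gb_s / weight_gb(params, bits_per_weight)


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        size = weight_gb(PARAMS, bits)
        tps = decode_ceiling_tps(PARAMS, bits, BANDWIDTH_GB_S)
        print(f"{name}: {size:.1f} GB of weights, at most {tps:.1f} tokens/s")
```

Under these assumptions the FP16 weights are nearly twice the size of the Q8_0 weights, so a memory-bound decoder would be expected to run at roughly half the speed in FP16. Whether output quality differs noticeably between the two is a separate question that this arithmetic does not address.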

