llama.cpp - Qwen3.6/3.5-MTP - Share your benchmarks t/s

llama.cpp users share benchmarks for the optimized Qwen3.6/3.5-MTP models

By PulseAugur Editorial · [1 source] · 2026-06-03 19:07

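The llama.cpp project has received significant optimizations and fixes for the Qwen3.6/3.5-MTP models, and recently merged changes have further improved performance. Users are encouraged to share benchmarks run on the latest release, including the full command line so that results can be compared accurately. The goal is to collect the tuned commands that yield the best tokens-per-second (t/s) performance.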
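A shared result is only comparable if it carries the exact command and build alongside the speed figure. The following is a minimal sketch of that idea, not part of the original post: the helper names `tokens_per_second` and `format_report`, the build label, the model filename, and the numbers in the usage example are all hypothetical placeholders rather than measured results.

```python
import shlex


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Return generation speed in tokens per second (t/s)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_tokens / seconds


def format_report(command: list[str], build: str, n_tokens: int, seconds: float) -> str:
    """Format a benchmark result together with the full command that produced it."""
    tps = tokens_per_second(n_tokens, seconds)
    return "\n".join([
        f"build: {build}",
        # shlex.join quotes arguments so the command can be pasted back into a shell
        f"command: {shlex.join(command)}",
        f"generated: {n_tokens} tokens in {seconds:.2f} s",
        f"speed: {tps:.2f} t/s",
    ])


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own command and timings.
    example_command = ["llama-server", "-m", "my model.gguf"]
    print(format_report(example_command, build="<your build tag>", n_tokens=512, seconds=16.0))
```

Keeping the command as an argument list and joining it with `shlex.join` avoids quoting mistakes when a model path contains spaces.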

Impact: Optimizations in llama.cpp may bring faster local inference for Qwen models, benefiting users with constrained hardware.

Ranking rationale: User-generated benchmarks and discussion of optimizations for open-source models.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · English (EN) · /u/pmttyji · 2026-06-03 19:07

llama.cpp - Qwen3.6/3.5-MTP - Share your benchmarks t/s

I think the dust has settled (95+%) for Qwen3.6/3.5-MTP. Since the initial PR, there have been many optimizations and fixes. Even earlier today, an MTP-related PR was merged and released (https://github.com/ggml-org/llama.cpp/releases/t…