MTP is nice and all, but what about PP speeds?

LocalLLaMA user reports performance drop after enabling MTP

By the PulseAugur editorial team · [1 source] · 2026-06-01 13:00

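A user on the r/LocalLLaMA subreddit reports that enabling "MTP" (most likely multi-token prediction, a speculative-decoding-style optimization, rather than multithreading) with the Qwen 3.6 27B model causes a significant drop in prompt-processing (PP) performance and GPU utilization. The user notes that the problem is not memory-related but a decline in processing speed, and is asking for an explanation of the behavior. They speculate that possible causes include bus contention from PCIe riser cables or an issue with the Vulkan API.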
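To make a report like this comparable across setups, it helps to express the slowdown as prompt-processing throughput (tokens per second) with and without MTP. The following is a minimal sketch of that arithmetic. The function names are hypothetical, and the token counts and timings are illustrative placeholders, not measurements from the post.

```python
def tokens_per_second(prompt_tokens: int, elapsed_seconds: float) -> float:
    """Prompt-processing throughput: tokens consumed per second of wall time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return prompt_tokens / elapsed_seconds


def relative_drop(baseline_tps: float, mtp_tps: float) -> float:
    """Fraction of baseline throughput lost when MTP is enabled (0.5 means 50%)."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - mtp_tps) / baseline_tps


if __name__ == "__main__":
    # Hypothetical timings for the same prompt, not figures from the Reddit post.
    baseline = tokens_per_second(prompt_tokens=4096, elapsed_seconds=8.0)
    with_mtp = tokens_per_second(prompt_tokens=4096, elapsed_seconds=16.0)
    print(f"baseline PP: {baseline:.1f} tok/s")
    print(f"with MTP:    {with_mtp:.1f} tok/s")
    print(f"drop:        {relative_drop(baseline, with_mtp):.0%}")
```

Running both configurations on the same prompt and hardware keeps the comparison fair; a drop that persists across prompt lengths would point away from a one-off measurement artifact.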

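Ranking rationale: User-generated content from a niche subreddit about a specific technical issue with a model and an optimization, lacking broader industry significance.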


AI-generated summary · Google Gemini · From 1 source.

Sources [1]

r/LocalLLaMA TIER_1 English(EN) · /u/milpster · 2026-06-01 13:00

MTP is nice and all, but what about PP speeds?

I don't know for the rest of you, but with my setup, as soon as i enable MTP, the PP performance and GPU usage drops significantly for some reason. It's not as much a memory issue for me as it is declining performance.

My setup is: 2x Rade…

