How do i prevent llama.cpp from offloading on Swap?

User seeks to stop llama.cpp from swapping out its KV cache

By the PulseAugur editorial team · [1 source] · 2026-06-11 11:22

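A user on Reddit's r/LocalLLaMA is asking how to prevent llama.cpp from offloading its KV cache to swap. Despite using llama.cpp flags intended to prevent this, the user still sees offloading as memory use nears the machine's 96GB of RAM, typically at around 91-92GB, while some capacity is still free. They are looking for a more aggressive way to ensure that offloading happens only when RAM is almost exhausted.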
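The threshold described above (swap filling at roughly 91-92GB of 96GB) can be measured directly. Swapping is normally decided by the operating system's memory manager rather than by the application itself. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo` (the post does not state the operating system); the function names `parse_meminfo`, `memory_pressure`, and `read_pressure` are hypothetical helpers, not part of llama.cpp.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a {field: value} mapping.

    Values are the raw integers from the file (KiB for most fields).
    """
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_pressure(fields: dict[str, int]) -> tuple[float, float]:
    """Return (RAM in use, swap in use), both in GiB."""
    kib_per_gib = 1024 * 1024
    ram_used = fields["MemTotal"] - fields["MemAvailable"]
    swap_used = fields["SwapTotal"] - fields["SwapFree"]
    return ram_used / kib_per_gib, swap_used / kib_per_gib


def read_pressure(path: str = "/proc/meminfo") -> tuple[float, float]:
    """Read the live values from the kernel (assumes Linux)."""
    with open(path) as f:
        return memory_pressure(parse_meminfo(f.read()))
```

Calling `read_pressure()` periodically while llama-server is running would show how much RAM is in use at the moment swap usage starts to grow, which helps separate kernel swap policy from anything llama.cpp itself controls.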

Ranking rationale: This is a user support question on Reddit, not a significant industry event.

Read on r/LocalLLaMA →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · TIER_1 · English (EN) · /u/No_Algae1753 · 2026-06-11 11:22

How do i prevent llama.cpp from offloading on Swap?

I have tried preventing this issue by using llama.cpp flags. However, I still have the issue: whenever I'm close to my 96GB of RAM, llama-server / llama.cpp decides to offload the KV cache onto my swap. This usually happens when I'm at 91-92GB of…
