Context, memory, and RAM/VRAM

Local LLM user questions the memory usage of a Qwen 27B model

By the PulseAugur editorial team · [1 source] · 2026-06-07 14:31

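A user running a large language model locally saw unexpected system memory (RAM) usage, even though they expected the context cache to be held mainly in video memory (VRAM). They are running a Qwen 27B model with llama.cpp and a memory system for the agent. The user noticed that system RAM usage rises significantly as the context cache fills. They want to know whether RAM is supposed to be used for the cache at all, and what is driving the additional RAM consumption during inference.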
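To give a sense of scale for the context cache in question, here is a minimal sketch of the usual key/value cache size estimate for a transformer with standard full attention. The layer count, KV head count, head dimension, and context length below are hypothetical placeholders, not the actual configuration of the Qwen 27B model from the post, and the 2 bytes per element assumes an unquantized fp16 cache. The estimate says nothing about where the cache is placed; whether it ends up in VRAM or RAM depends on how the runtime is configured.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate the KV cache size in bytes for a full-attention transformer.

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer. bytes_per_elem=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Hypothetical values chosen for illustration only (NOT the real model config).
size = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, n_ctx=32768)
print(f"Estimated KV cache: {size / 2**30:.1f} GiB")
```

Because the size grows linearly with `n_ctx`, a cache that is negligible for a short prompt can reach several gibibytes once a long context fills up.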

Ranking rationale: A user question about local LLM resource management, not a new release or a significant industry event.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA (English) · /u/UniqueIdentifier00 · 2026-06-07 14:31

Context, memory, and RAM/VRAM

This will be a slightly disorganized post, I apologize. I’m trying to understand the relationship between context, a memory system for the agent, RAM and VRAM. What I’ve been observing while watching my system performance while usin…
