r/LocalLLaMA
PulseAugur coverage of r/LocalLLaMA — every cluster mentioning r/LocalLLaMA across labs, papers, and developer communities, ranked by signal.
5 days with sentiment data
-
Quantized Qwen3.6-27B model achieves 100k context on 16GB VRAM
A user on Reddit's r/LocalLLaMA has detailed a method for running the Qwen3.6-27B model on a system with 16GB of VRAM, achieving a context length of 100,000 tokens. The process involves creating a custom GGUF quantizati…
-
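User documents how a powerful dual RTX 6000 build performs under heavy load
A user on the r/LocalLLaMA subreddit documented extended benchmarks of their dual RTX 6000 GPU build. The system, powered by a 1600W PSU, drew about 1650W at the wall with the CPU at 100% utilization and each GPU capped at 535W. The user focused on testing the CPU cooler's thermal performance under load and noted that temperatures held at around 95°C, suggesting that power draw, rather than cooling, is the build's current limiting factor.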
-
Qwen 35B model outperforms 27B on coding tasks, offering 8x speed boost
A user on Reddit's r/LocalLLaMA shared a benchmark comparing two versions of the Qwen 3.6 model on a MacBook Pro with an M5 Pro chip and 64GB of RAM. The 35B A3B model, using a 4-bit quantization, significantly outperfo…
-
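Qwen3.6 35b model excels at fast particle-system code generation
A user in Reddit's r/LocalLLaMA community shared their experience testing the Qwen3.6 35b a3b model, noting its impressive speed and coding ability. The user reported that the model generated code for a particle system with only a single minor ValueError, which they consider a positive result. They are asking the community to suggest further coding tasks to test the model on.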
-
GLM 5.1 achieves 40 tokens/sec locally on RTX 6000 Pro cards
A user on the r/LocalLLaMA subreddit has successfully optimized the GLM 5.1 model for local deployment, achieving impressive performance metrics. By applying specific patches to the sglang inference software and utilizi…
-
LocalLLaMA community celebrates the present as the future of AI
The r/LocalLLaMA subreddit is showcasing the current state of local large language model (LLM) deployment, with a post titled "This is where we are right now, LocalLLaMA." The accompanying image suggests significant adv…
-
r/LocalLLaMA implements new rules to combat AI-generated spam and low-effort posts
The r/LocalLLaMA subreddit, which has over one million weekly visitors, has updated its rules to combat increased spam and low-effort content. Key changes include implementing minimum karma requirements for users and re…