Thinking about grabbing 4x Ascend GX10s

User considers buying 4x Ascend GX10 GPUs to run future open-source LLMs

By the PulseAugur editorial team · [1 source] · 2026-07-01 10:30

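A user in Reddit's r/LocalLLaMA community is considering buying four Ascend GX10 GPUs to run future open-source large language models, such as a potential "fable 5" release. They cite benchmarks from others who ran GLM5.2 on similar hardware (4x DGX Sparks), reporting prompt processing at 400-500 tokens/second and output at about 15 tokens/second with a 128k context window. While acknowledging that this is not blazing fast, the user considers it usable, especially with quantization, and wants to be ready for upcoming models.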
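To put the quoted rates in wall-clock terms, here is a rough back-of-the-envelope sketch. It assumes "128k" means 128,000 tokens, that the prompt fills the whole context window, and that the reported rates hold steady for the entire run. None of this is stated in the post. The 1,000-token reply length is a hypothetical value chosen only for illustration.

```python
# Rough latency estimate from the throughput figures quoted in the post.
# Assumptions (not from the source): "128k" = 128,000 tokens, the prompt
# fills the whole context, and the rates stay constant throughout.

def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Time to process the prompt at a constant prompt-processing rate."""
    return prompt_tokens / tokens_per_second

def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate the reply at a constant output rate."""
    return output_tokens / tokens_per_second

PROMPT_TOKENS = 128_000   # assumed reading of "128k context"
REPLY_TOKENS = 1_000      # hypothetical reply length, for illustration only

for rate in (400, 500):   # reported prompt-processing range, tok/s
    minutes = prefill_seconds(PROMPT_TOKENS, rate) / 60
    print(f"prefill at {rate} tok/s: {minutes:.1f} min")

reply = decode_seconds(REPLY_TOKENS, 15)  # reported ~15 tok/s output
print(f"{REPLY_TOKENS}-token reply at 15 tok/s: {reply:.0f} s")
```

Under these assumptions, a full-context prompt takes roughly four to five and a half minutes before the first output token appears, and a 1,000-token reply adds about another minute. That is consistent with the poster's "not blazing fast, but usable" assessment.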

Impact: Prospective users are evaluating hardware configurations for running future open-source LLMs.

Ranking rationale: A user discussion about hardware for running LLMs.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/LocalLLaMA · /u/chikengunya · 2026-07-01 10:30

Thinking about grabbing 4x Ascend GX10s

Some in this sub have tested GLM5.2 on 4x DGX Sparks (or Ascend GX10) with 400-500 tok/s prompt processing and ~15 tok/s output at 128k context. Not blazing fast, but usable imo, especially with quantization.

My thinking: If there's an ope…

