PulseAugur

User explores running large GLM5.2 models on multi-node CPU cluster

A user asks whether large language models, specifically GLM5.2, can run on a Dell C6525 chassis with four server nodes. Each node has dual AMD EPYC 7702 processors, 512GB of RAM, and fast SSD storage, for a total of 2TB of RAM across the four nodes and a theoretical 409GB/s of memory bandwidth per node. The user wants to cluster the nodes either to improve token speed or to load larger variants, such as Unsloth 4-bit or 8-bit quantizations of GLM5.2, for agentic coding tasks.
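Whether a given quantization fits in one node or needs the whole cluster comes down to simple arithmetic. The sketch below is illustrative only: `ASSUMED_PARAMS_B` is a hypothetical parameter count, not a published GLM5.2 figure, and the estimate covers weights alone (no KV cache, quantization metadata, or OS overhead).

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GiB for a quantized model."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


NODES = 4
RAM_PER_NODE_GIB = 512                       # per node, from the post
CLUSTER_RAM_GIB = NODES * RAM_PER_NODE_GIB   # 2048 GiB across the cluster

# Hypothetical parameter count for illustration; replace with the real figure.
ASSUMED_PARAMS_B = 700.0

for bits in (4, 8):
    need = weights_gib(ASSUMED_PARAMS_B, bits)
    print(
        f"{bits}-bit: ~{need:.0f} GiB of weights | "
        f"fits one node: {need <= RAM_PER_NODE_GIB} | "
        f"fits cluster: {need <= CLUSTER_RAM_GIB}"
    )
```

If the weights fit in a single node, clustering is only about speed; if they do not, the model has to be split across nodes and the interconnect becomes part of the picture.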



AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. r/LocalLLaMA TIER_1 English(EN) · /u/StartupTim

    Is it possible to run a giant model like GLM5.2 on this cluster (4x servers with 512GB RAM + dual AMD Epyc)? 16 channel memory should hit 409GB/s per node.
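The 409GB/s figure in the title can be reproduced from the channel count, and it also gives a rough ceiling on decode speed, since CPU token generation is usually limited by how fast the weights can be streamed from RAM. This is a minimal sketch under stated assumptions: DDR4-3200 DIMMs and a 64-bit (8-byte) channel are assumed, not stated in the post, and the active-parameter count is a hypothetical placeholder rather than a GLM5.2 specification.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal)."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def decode_ceiling_tok_s(bandwidth_gbs: float, active_params_billion: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token


# 2 sockets x 8 channels = 16 channels; DDR4-3200 is an assumption.
node_bw = peak_bandwidth_gbs(channels=16, mega_transfers_per_s=3200)
print(f"peak per-node bandwidth: {node_bw:.1f} GB/s")

# Hypothetical active-parameter count for illustration only.
ASSUMED_ACTIVE_PARAMS_B = 40.0
ceiling = decode_ceiling_tok_s(node_bw, ASSUMED_ACTIVE_PARAMS_B, bits_per_weight=4)
print(f"theoretical decode ceiling at 4-bit: {ceiling:.1f} tok/s")
```

Sustained bandwidth on real hardware is typically well below the theoretical peak, especially across two NUMA sockets, so the ceiling should be read as optimistic.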

    Hey all,

    I have a piece of hardware lying around which is pretty fast from a traditional (non-GPU) server viewpoint. The hardware is the following:

    - Dell C6525 Server with Quad Node (4x server blades) with the following:
    …