PulseAugur

User explores NVMe SSDs for large model hosting via PCIe 5

A user on r/LocalLLaMA asks whether a server with many PCIe 5 lanes could host large language models directly from NVMe storage. The idea is to populate the lanes with NVMe SSDs so that their aggregate read bandwidth could, in theory, compete with VRAM when running models of 1-2TB. The user asks why this approach isn't more common for self-hosting very large models.

RANK_REASON User-generated question about a hypothetical hardware configuration for LLMs, lacking concrete data or a specific event.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. r/LocalLLaMA TIER_1 English(EN) · /u/StartupTim

    Dumb question: How would performance be if you took a used server with like 80 lanes pcie 5 and stuck NVMe on them for model run?

    So for LLMs, VRAM speed is king.

    But what if you bought a used server which had, for example, 80 lanes of pcie 5 available, and you bifurcated that to hold 40 SSDs @ 2x lanes, with each NVMe doing 15Gbps, that means a mirror of 40 2TB driv…
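
The arithmetic behind the question can be sketched in a few lines. This is a back-of-envelope estimate, not a benchmark: it uses the PCIe 5.0 link rate (32 GT/s per lane with 128b/130b encoding) and the 40-drive, 2-lane layout from the post, and it assumes a dense model whose weights are all read once per generated token. The function names are illustrative, and the 1000 GB model size is a placeholder for the 1-2TB range mentioned above. Protocol overhead, the drives' real sustained read rates, and CPU or memory bottlenecks are ignored, so actual throughput would be lower.

```python
# Back-of-envelope bandwidth estimate for the setup described in the post.
# Assumptions: link-level ceiling only, dense model, all weights read per token.

PCIE5_GT_PER_LANE = 32.0         # PCIe 5.0 raw rate per lane, in GT/s
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def lane_gb_per_s() -> float:
    """Theoretical one-direction throughput of a single PCIe 5.0 lane in GB/s."""
    return PCIE5_GT_PER_LANE * ENCODING_EFFICIENCY / 8


def aggregate_read_gb_per_s(drives: int, lanes_per_drive: int) -> float:
    """Upper bound on combined read bandwidth if every link is saturated."""
    return drives * lanes_per_drive * lane_gb_per_s()


def max_tokens_per_s(bandwidth_gb_per_s: float, model_size_gb: float) -> float:
    """Token-rate ceiling when each token requires streaming the full model."""
    return bandwidth_gb_per_s / model_size_gb


if __name__ == "__main__":
    bw = aggregate_read_gb_per_s(drives=40, lanes_per_drive=2)
    print(f"Per-lane ceiling:   {lane_gb_per_s():.2f} GB/s")
    print(f"Aggregate ceiling:  {bw:.0f} GB/s across 80 lanes")
    # Hypothetical 1000 GB dense model, chosen only for illustration.
    print(f"Token-rate ceiling: {max_tokens_per_s(bw, 1000.0):.2f} tokens/s")
```

Under these assumptions the 80 lanes top out at roughly 315 GB/s, which gives a ceiling of about 0.3 tokens per second for a 1 TB dense model. Mixture-of-experts models read only part of their weights per token, so their ceiling would be higher than this estimate suggests.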