PulseAugur

Redditor uses 768GB of used Optane RAM to run 1T-parameter LLM locally

A Redditor has run Kimi K2.5, a 1-trillion-parameter LLM, locally on a workstation with a single GPU by using 768GB of second-hand Intel Optane Persistent Memory (PMem) modules as RAM. The setup achieved roughly 4 tokens per second, which the coverage treats as notable for a budget build. The use of discontinued Optane DIMMs points to a potential market gap for affordable, high-capacity memory for large language model inference, especially as DRAM prices fluctuate.
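As a rough sanity check on the capacity claim, the arithmetic below estimates how many bits per weight a 1-trillion-parameter model can use if its weights alone must fit in 768GB. It is a back-of-the-envelope sketch, not a description of the Redditor's actual setup: it assumes decimal gigabytes, counts only weights (no KV cache, activations, or operating system overhead), ignores any weights held in GPU memory, and the function names are hypothetical. The source does not state which quantization was used.

```python
def max_bits_per_param(capacity_bytes: float, n_params: float) -> float:
    """Average bits per weight available if only the weights occupy the memory."""
    return capacity_bytes * 8 / n_params


def weights_size_gb(n_params: float, bits_per_param: float) -> float:
    """Size of the weights in decimal gigabytes at a given precision."""
    return n_params * bits_per_param / 8 / 1e9


CAPACITY_BYTES = 768e9  # 768GB, assumed to be decimal gigabytes
N_PARAMS = 1e12         # 1 trillion parameters, as reported

budget = max_bits_per_param(CAPACITY_BYTES, N_PARAMS)
print(f"Bit budget per parameter: {budget:.3f}")

# Compare a few common precisions against the 768GB capacity.
for bits in (16, 8, 4):
    size = weights_size_gb(N_PARAMS, bits)
    verdict = "fits" if size <= CAPACITY_BYTES / 1e9 else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.0f} GB ({verdict})")
```

Under these assumptions, 16-bit and 8-bit weights exceed 768GB, while a quantization of about 6 bits per weight or lower would fit.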

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Demonstrates a cost-effective method for running large LLMs locally, potentially influencing future hardware configurations for AI inference.

RANK_REASON User-driven application of existing hardware for a specific AI task.


COVERAGE [2]

  1. Tom's Hardware TIER_1 · Mark Tyson ·

    768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second

    A Redditor has caused a stir by coaxing a workstation build using Optane PMem DIMMs as RAM to run a 1-trillion parameter LLM.

  2. Mastodon — fosstodon.org TIER_1 · [email protected] ·

    768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second A Redditor has caused a stir by coaxing a workstation build using Optane PMem DIMMs as RAM to run a 1-t…