Redditor uses 768GB of used Optane RAM to run 1T-parameter LLM locally

By PulseAugur Editorial · [3 sources] · 2026-05-23 11:20

A Redditor has run a 1-trillion-parameter LLM, Kimi K2.5, locally on a workstation with a single GPU by using 768GB of second-hand Intel Optane Persistent Memory modules as RAM. The setup achieved approximately 4 tokens per second, a result considered impressive for budget hardware. The use of discontinued Optane DIMMs points to a potential market gap for affordable, high-capacity memory for large language model inference, especially as DRAM prices fluctuate.
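The capacity figure can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a description of the Redditor's actual configuration: the source does not state which quantization was used, so the bit widths are illustrative assumptions, gigabytes are taken as decimal, and only the weights are counted (no KV cache or runtime overhead).

```python
# Back-of-the-envelope check: do the weights of a 1-trillion-parameter
# model fit in 768 GB? The bit widths below are illustrative assumptions;
# the source does not say which quantization was used.

PARAMS = 1_000_000_000_000  # 1 trillion parameters (from the summary)
CAPACITY_GB = 768           # Optane capacity (from the summary), decimal GB assumed


def weights_gb(params: int, bits_per_param: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def fits(params: int, bits_per_param: float, capacity_gb: float) -> bool:
    """True if the weights alone fit in the given capacity."""
    return weights_gb(params, bits_per_param) <= capacity_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weights_gb(PARAMS, bits)
        verdict = "fits" if fits(PARAMS, bits, CAPACITY_GB) else "does not fit"
        print(f"{bits:>2}-bit weights: {size:,.0f} GB, {verdict} in {CAPACITY_GB} GB")
```

Under these assumptions, 16-bit and 8-bit weights (2,000 GB and 1,000 GB) exceed 768GB, while 4-bit weights (500 GB) fit. This suggests a model of this size needs fewer than 8 bits per parameter on average to load into that much memory.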

IMPACT Demonstrates a cost-effective method for running large LLMs locally, potentially influencing future hardware configurations for AI inference.

RANK_REASON User-driven application of existing hardware for a specific AI task.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

Tom's Hardware TIER_1 English(EN) · Mark Tyson · 2026-05-23 11:20

768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second

A Redditor has caused a stir by coaxing a workstation build using Optane PMem DIMMs as RAM to run a 1-trillion parameter LLM.

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-23 14:48

768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second A Redditor has caused a stir by coaxing a workstation build using Optane PMem DIMMs as RAM to run a 1-t…

LINKS tomshardware.com/…/enthusiast-runs-1-tril… tomshardware.com/tech-industry

r/singularity TIER_2 English(EN) · /u/Anen-o-me · 2026-05-24 04:24

768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second

LINKS https://www.reddit.com/r/singularity/comments/1tm1u3l/768gb_of_cheap_intel_optane_dimm_memory_sticks/
