A 10 year old Xeon is all you need (for 26B-A4B MTP Drafters without GPU) https://point.free/blog/gemma-4-on-a-2016-xeon/ # HackerNews # Tech # AI
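A 10-year-old Intel Xeon E5-2680 v4 processor, costing under $20, can run a 26-billion-parameter model without a GPU. The model is a mixture-of-experts design in which only about 4 billion parameters are active per token (the "A4B" in its name), and the setup pairs it with multi-token prediction (MTP) drafters, which propose several tokens ahead for the main model to verify. The weights are held in system RAM rather than GPU VRAM. Together, these make inference practical on older, less powerful hardware and make large models more accessible.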
IMPACT Enables running large AI models on low-cost, older hardware, democratizing access to advanced AI capabilities.