PulseAugur

Modded Nvidia V100 server GPU runs LLMs efficiently for $200

A YouTuber adapted an Nvidia Tesla V100 server GPU, originally built for Nvidia's SXM server socket, into a standard PCIe card for consumer motherboards. The modification cost around $200 in total ($100 for the GPU and $100 for the adapter) and lets the older Volta-architecture GPU run large language models efficiently. In tests, the V100 outperformed newer cards such as the RTX 3060 and RX 7800 XT in tokens per second for AI inference, and it showed better power efficiency when power-limited.

IMPACT Demonstrates that older, repurposed server hardware can offer competitive AI inference performance and efficiency, potentially lowering costs for AI operators.

RANK_REASON This is a hardware modification and repurposing of existing hardware, not a new product release from a manufacturer.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Tom's Hardware · English (EN) · Hassam Nasir

    $200 'socketed' Nvidia AI GPU for servers hacked into a PCIe card with custom PCB and 3D-printed cooling — modded Tesla V100 SMX data center GPU runs AI LLMs and is more efficient than many modern midrange offerings in AI inference

    Turns out, Nvidia's older Volta-era V100 AI GPU is still pretty capable today, even with just 16GB of VRAM. A YouTuber got his hands on the SXM variant for just $100, converted it to a PCIe x16 interface for another $100 with an adapter, and got some pretty impressive results ac…