PulseAugur

Budget GPU advice sought for local LLM inference

A user on the r/LocalLLaMA subreddit is asking which budget GPU to buy to run large language models locally alongside their existing RX 6800. They are choosing between a Radeon VII, which they say would give them 32GB of VRAM and fast inference, and two P100 GPUs, which would give them a total of 48GB; both options cost roughly 240€. The user is weighing more VRAM against faster inference, asking specifically whether the extra VRAM is useful for Mixture-of-Experts (MoE) models at Q8 quantization, and requesting recommendations for other suitable MoE models.
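
As a rough illustration of the VRAM question above, the sketch below estimates whether a model's weights would fit in 32GB or 48GB at Q8. It rests on assumptions not taken from the post: about 8.5 bits per weight (the density of llama.cpp's Q8_0 format), a hypothetical 2 GiB allowance for KV cache and buffers, and illustrative function names and parameter counts. For an MoE model held entirely in VRAM, the total parameter count is what matters here, not the number of parameters active per token.

```python
def q8_weight_gib(params_billion: float, bits_per_weight: float = 8.5) -> float:
    """Approximate size of the quantized weights in GiB.

    bits_per_weight=8.5 is an assumption matching Q8_0-style quantization
    (32 int8 weights plus one fp16 scale per block).
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """Return True if weights plus a fixed overhead fit in the given VRAM.

    overhead_gib is a hypothetical allowance for KV cache and runtime
    buffers; real usage depends on context length and backend.
    """
    return q8_weight_gib(params_billion) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical total parameter counts, in billions.
    for size in (14, 30, 40):
        print(
            f"{size}B: ~{q8_weight_gib(size):.1f} GiB weights, "
            f"fits 32 GiB: {fits(size, 32)}, fits 48 GiB: {fits(size, 48)}"
        )
```

Under these assumptions, a model of about 40B total parameters would need the 48GB configuration, while one of about 30B would only just fit in 32GB. Splitting a model across several cards adds its own overhead, which this estimate ignores.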

RANK_REASON User-generated content on a consumer hardware forum discussing personal budget choices for running open-source models.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. r/LocalLLaMA TIER_1 English(EN) · /u/bdsmmaster007

    Buy recommendations on a tight budget to aid my RX 6800

    So after a few hours of research, I'm torn between getting either a Radeon VII or 2 P100 (both options for roughly 240€). The Radeon would give me 32GB of VRAM and fast inference, while the 2 P100 would give me a total of 48GB, but roughl…