What models you guys running on 8GB? 16GB VRAM? 24GB? 32GB? 48GB?
A discussion on the r/LocalLLaMA subreddit looks at what it takes to run large language models (LLMs) on consumer-grade hardware at different VRAM sizes. Users report which models they run on systems with 8GB to 48GB of VRAM, along with their hardware configurations, KV cache and context management strategies, and the performance they achieve. The thread collects these reports into a snapshot of current local LLM deployment.
AI IMPACT: Provides practical insights for individuals looking to deploy LLMs on consumer hardware.
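To make the VRAM tiers and the KV cache trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the thread. It uses the standard estimate that weights take roughly parameters × bits per weight / 8 bytes, and that the KV cache takes 2 × layers × KV heads × head dimension × context length × bytes per element. The names `ModelShape`, `weight_bytes`, `kv_cache_bytes`, and `fits` are hypothetical, the example model shape is illustrative, and the 1 GiB overhead allowance and 4.5 bits per weight are assumptions, so treat the results as a first approximation only.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class ModelShape:
    """Hypothetical description of a decoder-only transformer."""
    n_params: float   # total parameter count
    n_layers: int     # number of transformer layers
    n_kv_heads: int   # key/value heads (fewer than query heads under GQA)
    head_dim: int     # dimension of each attention head


def weight_bytes(shape: ModelShape, bits_per_weight: float) -> float:
    """Approximate size of the weights at a given quantization level."""
    return shape.n_params * bits_per_weight / 8


def kv_cache_bytes(shape: ModelShape, context_len: int,
                   bytes_per_element: int = 2) -> int:
    """Approximate KV cache size for one sequence.

    The leading factor 2 covers one key tensor and one value tensor per
    layer. bytes_per_element=2 corresponds to a 16-bit cache.
    """
    return (2 * shape.n_layers * shape.n_kv_heads * shape.head_dim
            * context_len * bytes_per_element)


def fits(shape: ModelShape, vram_gib: float, context_len: int,
         bits_per_weight: float, overhead_gib: float = 1.0) -> bool:
    """Check whether weights + KV cache + an assumed overhead fit in VRAM.

    overhead_gib is a guess for activations and runtime buffers.
    """
    total = (weight_bytes(shape, bits_per_weight)
             + kv_cache_bytes(shape, context_len)
             + overhead_gib * GIB)
    return total <= vram_gib * GIB


# Illustrative shape for a model of roughly 8 billion parameters.
example = ModelShape(n_params=8e9, n_layers=32, n_kv_heads=8, head_dim=128)

for vram in (8, 16, 24, 32, 48):
    ok = fits(example, vram_gib=vram, context_len=8192, bits_per_weight=4.5)
    print(f"{vram} GB VRAM, 8192-token context, ~4.5 bits/weight: fits={ok}")
```

Because the cache term scales linearly with context length, halving the context or using a smaller cache element type frees memory in direct proportion, which is why context settings matter as much as model size on smaller cards.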