Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM
A user has finished assembling a custom server for running large language models (LLMs) locally. The build pairs an AMD EPYC 9575F processor and 768GB of ECC RAM with four NVIDIA RTX 3090 GPUs (24GB each, 96GB of VRAM in total). The server is intended for high-throughput inference, using tools like vLLM for smaller models and llama.cpp for larger ones, with a planned application in a space simulation for AI-driven NPC planning.
AI IMPACT: Enables local, high-performance LLM inference for advanced personal projects.
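
The split between vLLM for smaller models and llama.cpp for larger ones can be illustrated with some back-of-the-envelope memory arithmetic. The sketch below is not the builder's actual method: the function names, the 80% VRAM headroom factor, and the example model sizes are all assumptions chosen for illustration. It only estimates weight size from parameter count and quantization width, then checks whether that fits in the 96GB of VRAM or would need to spill into the 768GB of system RAM, which llama.cpp can use by keeping part of the model on the CPU.

```python
# Hardware figures taken from the build description.
GPU_COUNT = 4
VRAM_PER_GPU_GB = 24                          # one RTX 3090
TOTAL_VRAM_GB = GPU_COUNT * VRAM_PER_GPU_GB   # 96
SYSTEM_RAM_GB = 768


def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in (decimal) GB.

    1e9 parameters * bits / 8 bytes = params_billion * bits / 8 GB.
    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


def pick_backend(params_billion: float, bits_per_weight: float,
                 vram_headroom: float = 0.8) -> str:
    """Hypothetical routing rule (an assumption, not from the source).

    vram_headroom reserves part of the VRAM for KV cache and overhead;
    0.8 is an illustrative guess, not a measured value.
    """
    size = weight_size_gb(params_billion, bits_per_weight)
    if size <= TOTAL_VRAM_GB * vram_headroom:
        return "vllm"          # weights fit entirely on the GPUs
    if size <= TOTAL_VRAM_GB + SYSTEM_RAM_GB:
        return "llama.cpp"     # part of the model stays in system RAM
    return "too large"


if __name__ == "__main__":
    # Illustrative model sizes, not ones named in the post.
    for params, bits in [(8, 16), (70, 4), (671, 4)]:
        size = weight_size_gb(params, bits)
        print(f"{params}B @ {bits}-bit: ~{size:.1f} GB -> {pick_backend(params, bits)}")
```

In practice the headroom needed depends heavily on context length and batch size, so the threshold here should be treated as a starting point rather than a rule.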