Brief · PulseAugur

TOOL · r/LocalLLaMA · 3h

Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM

A user has finished building a custom server for running large language models (LLMs) locally. The build pairs an AMD EPYC 9575F processor and 768GB of ECC RAM with four NVIDIA RTX 3090 GPUs, for 96GB of VRAM in total (24GB per card). The server is intended for high-throughput inference, using tools such as vLLM for smaller models and llama.cpp for larger ones. One planned application is AI-driven NPC planning in a space simulation.

IMPACT Enables local, high-performance LLM inference for advanced personal projects.
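
As a rough illustration of the "smaller models on vLLM, larger ones on llama.cpp" split, the sketch below estimates whether a model's weights fit in the build's 96GB of VRAM or would spill into the 768GB of system RAM. The function names, the 1.2× overhead factor, and the mapping of each outcome to a tool are assumptions made for illustration, not details from the original post.

```python
# Back-of-the-envelope sizing sketch. Hardware figures come from the post;
# the overhead factor and the category labels are illustrative assumptions.
GPU_COUNT = 4
VRAM_PER_GPU_GB = 24    # RTX 3090
SYSTEM_RAM_GB = 768

TOTAL_VRAM_GB = GPU_COUNT * VRAM_PER_GPU_GB  # 96


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: parameters x bits per weight / 8."""
    return params_billion * bits_per_weight / 8


def placement(params_billion: float, bits_per_weight: float,
              overhead: float = 1.2) -> str:
    """Classify where a model could live on this machine.

    `overhead` is an assumed multiplier for KV cache and activations.
    """
    need = weights_gb(params_billion, bits_per_weight) * overhead
    if need <= TOTAL_VRAM_GB:
        return "vram"        # fits on the GPUs alone
    if need <= TOTAL_VRAM_GB + SYSTEM_RAM_GB:
        return "vram+ram"    # needs partial CPU/RAM offload
    return "too large"


if __name__ == "__main__":
    for name, params, bits in [("70B @ 4-bit", 70, 4),
                               ("70B @ 16-bit", 70, 16),
                               ("671B @ 4-bit", 671, 4)]:
        print(name, "->", placement(params, bits))
```

Under these assumptions, a "vram" result would be a candidate for GPU-only serving (for example with vLLM), while "vram+ram" suggests a runtime that can offload layers to system memory, such as llama.cpp. Real memory use depends on context length, batch size, and quantization format.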

NVIDIA
AMD
RTX 3090
vLLM
AMD EPYC 9575F
llama.cpp