Русский(RU) Cколько железа нужно ИИ-агенту? Как мы считали ресурсы для on-premise LLM и почему калькуляторы ошиблись в 5 раз На связи Сергей Смирнов, AI-инженер и основател

AI Engineer Details On-Premise LLM Hardware Calculation Challenges

By PulseAugur Editorial · [1 sources] · 2026-06-12 10:52

An AI engineer details the challenges of accurately calculating hardware requirements for on-premise LLM deployments. Initial estimates using a popular calculator for a GPT-OSS-120B model on two RTX Pro 6000 Blackwell GPUs predicted 5000 tokens/sec, but real-world performance was five times slower. The article explains how to properly assess LLM resource needs, especially with non-standard hardware, and describes a rigorous testing process to provide clients with reliable performance guarantees. AI

IMPACT Highlights the difficulty in accurately provisioning hardware for on-premise AI, potentially impacting enterprise adoption costs and timelines.

RANK_REASON Article details a specific technical challenge and methodology for on-premise LLM deployment, akin to a technical paper or case study. [lever_c_demoted from research: ic=1 ai=1.0]

Read on Mastodon — fosstodon.org →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

Mastodon — fosstodon.org TIER_1 Русский(RU) · [email protected] · 2026-06-12 10:52

How much iron does an AI agent need? How we calculated resources for on-premise LLM and why calculators were 5 times wrong. Sergey Smirnov, AI Engineer and Founder, is speaking.

Cколько железа нужно ИИ-агенту? Как мы считали ресурсы для on-premise LLM и почему калькуляторы ошиблись в 5 раз На связи Сергей Смирнов, AI-инженер и основатель LLMStart.ru. Один из самых частых вопросов от бизнеса: «Сколько и какого железа нужно, чтобы развернуть ИИ-агента у на…

LINKS habr.com/…/1046722 habr.com/…/articles

COVERAGE [1]

How much iron does an AI agent need? How we calculated resources for on-premise LLM and why calculators were 5 times wrong. Sergey Smirnov, AI Engineer and Founder, is speaking.

RELATED ENTITIES

RELATED TOPICS