Brief · PulseAugur

TOOL · dev.to — LLM tag English(EN) · 4h

Idle GPUs burn money too: a Kubernetes operator that scales large models down to zero

Hearth, a new Kubernetes operator, aims to cut the cost of self-hosting open-source LLMs by scaling them down to zero when they are not in use. It supports hardware accelerators beyond NVIDIA GPUs, including Ascend chips, and lets a model be deployed from a single manifest. Hearth is currently in alpha and not production-ready: scale-to-zero has been demonstrated on NVIDIA GPUs, while validation of the Ascend backend is still in progress.

IMPACT Could reduce operational costs for self-hosted LLMs by releasing GPUs while models sit idle.

NVIDIA
DeepSeek
LLMs
Qwen
Kubernetes
Hearth