NVIDIA releases free 550B Nemotron 3 Ultra model, requires datacenter for hosting

By PulseAugur Editorial · [1 sources] · 2026-06-18 06:00

NVIDIA has released Nemotron 3 Ultra, a 550 billion parameter open-weight model featuring a hybrid Mamba-Attention design and a 1 million token context window. The model weights are freely available under the OpenMDW-1.1 license, but self-hosting requires significant datacenter-class hardware, such as multiple H100 or H200 GPUs. For easier access, NVIDIA offers a hosted API that is compatible with the OpenAI protocol. AI

IMPACT This release provides a powerful open-weight model, but its demanding hardware requirements highlight the ongoing challenges of self-hosting large AI systems.

RANK_REASON Frontier-lab model release with system card. [lever_c_demoted from frontier_release: ic=1 ai=1.0]

Read on dev.to — LLM tag →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

dev.to — LLM tag TIER_1 English(EN) · Creeta · 2026-06-18 06:00

NVIDIA's 550B finally lands: free to use, expensive to host

<p>NVIDIA shipped its biggest open-weight model yet, and the weights are free to download — but standing it up yourself is a datacenter project, not a weekend one.</p> <h2> Nemotron 3 Ultra: What Landed on June 4 </h2> <p>Nemotron 3 Ultra is a text-only, open-weight Mixture-of-Ex…

COVERAGE [1]

NVIDIA's 550B finally lands: free to use, expensive to host

RELATED ENTITIES

RELATED TOPICS