Nemotron 3 Ultra went live June 4. Here's the call that works.
NVIDIA has released Nemotron 3 Ultra, a 550-billion-parameter open-weights model that sets a new benchmark for US-based releases. The hybrid Mamba-Transformer mixture-of-experts model has a 1M-token context window and is optimized for agent harnesses. It scores high on the Artificial Analysis Intelligence Index but trails some Chinese and closed-source models in raw capability; its strength is speed, at over 300 tokens per second.
AI IMPACT Sets a new high-water mark for US open-weights models, particularly in speed, potentially influencing agent development.