Nvidia's GB300 ultra NVL72 has demonstrated a 2.7x speed advantage over the GB200 NVL72 in inference tasks using the vLLM project's engine. That gain exceeds what the GB300's specifications alone would suggest: relative to the GB200, it offers 1.5x the NVFP4 FLOPs and 1.5x the HBM capacity, with identical HBM bandwidth.
Summary written by gemini-2.5-flash-lite from 1 source.
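The gap between the quoted specifications and the measured result can be made concrete with a little arithmetic. The sketch below is a deliberately naive model, assuming throughput scales linearly with whichever single resource is the bottleneck; the names `SPEC_RATIOS` and `naive_speedup_range` are illustrative and not taken from the source. It does not attempt to explain where the extra speedup comes from.

```python
# Spec ratios quoted in the summary (GB300 relative to GB200).
SPEC_RATIOS = {
    "nvfp4_flops": 1.5,
    "hbm_capacity": 1.5,
    "hbm_bandwidth": 1.0,
}
OBSERVED_SPEEDUP = 2.7  # reported vLLM inference speedup


def naive_speedup_range(ratios):
    """Return (low, high) speedup if one resource alone set throughput.

    Assumption: throughput scales linearly with the bottleneck resource.
    """
    values = list(ratios.values())
    return min(values), max(values)


low, high = naive_speedup_range(SPEC_RATIOS)
# Factor not accounted for by the best-case spec ratio.
unexplained = OBSERVED_SPEEDUP / high

print(f"naive spec-based range: {low:.1f}x to {high:.1f}x")
print(f"observed / best-case spec ratio: {unexplained:.2f}x")
```

Under this simple model the specifications bound the expected gain at 1.5x, so roughly a further 1.8x of the reported 2.7x is not accounted for by these three figures alone.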
IMPACT This hardware advancement could accelerate AI inference, potentially lowering serving costs and enabling more complex models.
RANK_REASON Announcement of a new hardware product (GB300 ultra NVL72) with significant performance improvements over its predecessor.