Nvidia's GB300 GPU shows 2.7x faster inference than GB200

By PulseAugur Editorial · [1 source] · 2026-05-04 21:00

Nvidia's GB300 ultra NVL72 ran inference 2.7x faster than the GB200 NVL72 on the vLLM project's inference engine, according to SemiAnalysis. That gain exceeds what the on-paper specifications suggest: relative to the GB200, the GB300 offers roughly 1.5x the NVFP4 FLOPs, 1.5x the HBM capacity, and the same HBM bandwidth.
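The gap between the quoted specifications and the measured result can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses the ratios quoted above and assumes, as a simplification not stated by the source, that a single hardware ratio would cap the speedup if nothing else changed. The names `spec_ratios` and `unexplained_factor` are hypothetical.

```python
# Ratios quoted in the summary (GB300 NVL72 relative to GB200 NVL72).
spec_ratios = {
    "nvfp4_flops": 1.5,    # ~1.5x NVFP4 FLOPs
    "hbm_capacity": 1.5,   # 1.5x HBM capacity
    "hbm_bandwidth": 1.0,  # same HBM bandwidth
}
measured_speedup = 2.7     # reported vLLM inference speedup


def unexplained_factor(measured: float, spec: dict) -> float:
    """Divide the measured speedup by the largest single on-paper ratio.

    Assumption (ours, not the source's): no single spec ratio alone
    would yield more than its own value in speedup.
    """
    return measured / max(spec.values())


best_case = max(spec_ratios.values())
gap = unexplained_factor(measured_speedup, spec_ratios)

print(f"largest on-paper ratio: {best_case:.1f}x")
print(f"measured speedup:       {measured_speedup:.1f}x")
print(f"remaining factor:       {gap:.1f}x")
```

Under that assumption, a factor of about 1.8x is left unaccounted for by any one quoted specification. The source post is truncated before it gives its own explanation, so this sketch does not attribute the remainder to any cause.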

IMPACT Faster inference on the same engine could lower serving costs and make more complex models practical to deploy.

RANK_REASON Announcement of a new hardware product (GB300 ultra NVL72) with significant performance improvements over its predecessor.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

X — SemiAnalysis TIER_1 English(EN) · SemiAnalysis_ · 2026-05-04 21:00

MINECRAFT STEVE ALERT: GB300 ultra NVL72 is already 2.7x faster 🚀 than GB200 NVL72 on one of the industry standard inference engine known as @vllm_project. On paper, GB300 only has ~1.5x faster NVFP4 FLOP & 1.5x more HBM capacity & same HBM BW than GB200 but due to the f…
