A performance analysis by SemiAnalysis indicates that NVIDIA's Blackwell GPUs show a 61% performance regression when running the Qwen3.5 397B model under SGLang, because NVLink multicast is not supported with confidential computing. The limitation reduces how efficiently computation can be distributed across multiple GPUs, which hinders performance for large language models.
AI IMPACT This hardware limitation could slow the deployment and reduce the efficiency of large language models on next-generation NVIDIA hardware.
RANK_REASON Analysis of hardware performance regression on a specific model.
AI-generated summary · Google Gemini · from 1 source.