v0.22.0rc3: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768)
The vLLM project has released version 0.22.0rc3, a release candidate that fixes a hard-coded timeout applied during multi-API-server startup. The change is tracked as #43768 and is intended to make vLLM more reliable when it starts several API servers at once. The fix was co-authored by Nick Hill, and the release was tagged by Vadim Gimpelson.
AI IMPACT Makes startup more reliable for vLLM inference deployments that run multiple API servers.