significant · [1 source] · 2026-05-22 03:05 · 中文(ZH) The fastest among top-tier models! Zhipu, are you 'spraying' code?

Zhipu AI launches GLM-5.1 high-speed API at 400 tokens/sec

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Zhipu AI has released a high-speed API for its GLM-5.1 model that generates output at 400 tokens per second. The offering is positioned as the fastest among top-tier models and aims to improve the experience of using AI agents by cutting wait times and delivering feedback more frequently. The speedup is attributed to joint engineering work between Zhipu's GLM team and TileRT on the inference engine, the scheduling system, and the underlying infrastructure.


IMPACT Accelerates AI agent responsiveness and real-time interaction capabilities, particularly in coding and content generation tasks.

RANK_REASON Model release from a frontier lab with performance metrics.


COVERAGE [1]

量子位 (QbitAI) TIER_1 中文(ZH) · 十三 · 2026-05-22 03:05

The fastest among top-tier models! Zhipu, are you 'spraying' code?

400 tokens/s

