NVIDIA releases quantized GLM-5.2 MoE model with 1M context

By PulseAugur Editorial · [1 source] · 2026-06-22 19:55

NVIDIA has released GLM-5.2 NVFP4, a quantized version of ZAI's GLM-5.2. The Mixture-of-Experts model is optimized for reasoning and coding tasks and features sparse attention and a 1-million-token context length. NVIDIA describes it as ready for deployment in AI agent systems, chatbots, and RAG applications, and it is available under the MIT License.
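
To give a rough sense of what NVFP4 quantization means for deployment, the sketch below estimates weight storage only. It assumes the NVFP4 layout NVIDIA has documented (4-bit values that share one 8-bit scale per 16-value block, or about 4.5 bits per weight) and ignores the per-tensor scale, any layers left unquantized, the KV cache, and activations. The parameter count is a hypothetical placeholder, not GLM-5.2's actual size, and the function names are illustrative.

```python
# Rough weight-memory estimate for block-scaled 4-bit quantization.
# Assumption: 4-bit values plus one 8-bit scale per 16-value block.

def nvfp4_weight_bytes(num_params, value_bits=4, scale_bits=8, block_size=16):
    """Approximate bytes needed to store num_params weights in NVFP4."""
    bits_per_param = value_bits + scale_bits / block_size  # 4.5 with defaults
    return num_params * bits_per_param / 8


def bf16_weight_bytes(num_params):
    """Bytes needed to store the same weights in BF16 (2 bytes each)."""
    return num_params * 2


# Hypothetical parameter count, chosen only for illustration.
EXAMPLE_PARAMS = 100_000_000_000

if __name__ == "__main__":
    quantized = nvfp4_weight_bytes(EXAMPLE_PARAMS)
    baseline = bf16_weight_bytes(EXAMPLE_PARAMS)
    print(f"NVFP4 weights: {quantized / 1e9:.2f} GB")
    print(f"BF16 weights:  {baseline / 1e9:.2f} GB")
    print(f"Reduction:     {baseline / quantized:.2f}x")
```

Under these assumptions the weights shrink by a factor of roughly 3.5 relative to BF16, whatever the parameter count. Real checkpoints will differ, because some layers are typically kept at higher precision.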

IMPACT This quantized MoE model with a 1M context window could accelerate deployment in AI agent systems and RAG applications.

RANK_REASON Frontier-lab model release with system card.

GLM-5.2
Hugging Face
MIT License
Nemotron-Competitive-Programming-v1
Nemotron-Math-v2
Nemotron-Science-v1
Nemotron-SFT-Agentic-v2
Nemotron-SFT-Instruction-Following-Chat-v2
Nemotron-SFT-Multilingual-v1
Nemotron-SFT-SWE-v2
NVIDIA
NVIDIA Blackwell
nvidia/GLM-5.2-NVFP4
SGLang
vLLM

model release
product

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Hugging Face Trending Models TIER_1 Português(PT) · nvidia · 2026-06-22 19:55

nvidia/GLM-5.2-NVFP4

text-generation · 441 downloads · 64 likes
