NVIDIA has introduced Nemotron-Labs Diffusion, a new family of diffusion language models (DLMs) designed to address limitations of traditional autoregressive models, which generate text one token at a time. DLMs instead generate multiple tokens in parallel and then iteratively refine them, which allows faster generation and lets the model revise tokens it has already produced. The models come in 3B, 8B, and 14B parameter scales, in both base and instruction-tuned chat variants, and the family includes a vision-language model. They support three generation modes: standard autoregressive decoding, diffusion-based block generation, and a self-speculation mode that drafts with diffusion and verifies autoregressively to combine speed with accuracy.
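The draft-then-verify idea behind the self-speculation mode can be sketched with toy stand-ins. This is an illustration only, not NVIDIA's implementation or API: `ar_next`, `draft_block`, and `self_speculative_generate` are hypothetical names, the "model" is a simple arithmetic rule, and the drafter is made deliberately imperfect so the verification step has something to correct. A real system would verify a whole block in one parallel forward pass rather than in a Python loop.

```python
from typing import List

VOCAB = 50  # toy vocabulary size (assumption for this sketch)


def ar_next(prefix: List[int]) -> int:
    """Toy autoregressive rule: the next token is a function of the prefix."""
    return (sum(prefix) * 31 + len(prefix)) % VOCAB


def draft_block(prefix: List[int], block_size: int) -> List[int]:
    """Toy drafter standing in for a parallel diffusion pass.

    It proposes a whole block at once and is deliberately wrong at some
    positions, to mimic an imperfect draft.
    """
    ctx = list(prefix)
    block = []
    for _ in range(block_size):
        tok = ar_next(ctx)
        if len(ctx) % 3 == 2:
            tok = (tok + 1) % VOCAB  # inject a drafting error
        block.append(tok)
        ctx.append(tok)
    return block


def greedy_ar(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: plain token-by-token autoregressive decoding."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(ar_next(out))
    return out


def self_speculative_generate(
    prompt: List[int], n_new: int, block_size: int = 4
) -> List[int]:
    """Draft a block, then keep only what the autoregressive rule confirms."""
    out = list(prompt)
    target_len = len(prompt) + n_new
    while len(out) < target_len:
        draft = draft_block(out, min(block_size, target_len - len(out)))
        for tok in draft:
            expected = ar_next(out)
            out.append(expected)  # the verifier's token is always the one kept
            if tok != expected:
                # Later draft tokens were conditioned on a wrong token,
                # so the rest of the block is discarded and redrafted.
                break
    return out


if __name__ == "__main__":
    prompt = [1, 2, 3]
    print(self_speculative_generate(prompt, 10))
    print(greedy_ar(prompt, 10))
```

In this sketch the verified output always matches plain autoregressive decoding; any speedup in a real system would come from accepting several drafted tokens per verification pass.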
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a new generation of language models that could significantly speed up text generation and improve revision capabilities, impacting latency-sensitive applications.
RANK_REASON New model family release from a major AI lab (NVIDIA).