NVIDIA unveils Nemotron-TwoTower diffusion language model

By PulseAugur Editorial · [1 source] · 2026-06-25 08:34

NVIDIA has introduced Nemotron-TwoTower-30B-A3B-Base-BF16, a diffusion-based language model. Instead of generating one token at a time, the model uses a diffusion denoiser tower to process blocks of tokens concurrently. NVIDIA reports that this approach retains nearly all of the quality of its autoregressive counterparts while significantly increasing generation speed.
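To make the speed argument concrete, here is a back-of-the-envelope sketch that counts sequential model passes for token-by-token decoding versus block-wise denoising. The block size, the number of denoising steps, and the function names are illustrative assumptions, not NVIDIA's published configuration, and the model ignores the fact that a pass over a whole block may cost more than a pass for a single token.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Token-by-token decoding needs one sequential pass per generated token."""
    return num_tokens


def block_diffusion_passes(num_tokens: int, block_size: int, denoise_steps: int) -> int:
    """Block-wise denoising: each block of tokens is refined together over a
    fixed number of denoising passes (both values are hypothetical here)."""
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * denoise_steps


def sequential_speedup(num_tokens: int, block_size: int, denoise_steps: int) -> float:
    """Ratio of sequential passes, autoregressive over block diffusion."""
    return autoregressive_passes(num_tokens) / block_diffusion_passes(
        num_tokens, block_size, denoise_steps
    )


if __name__ == "__main__":
    # Illustrative numbers only: 1024 tokens, blocks of 32, 8 denoising passes each.
    print(block_diffusion_passes(1024, 32, 8))
    print(sequential_speedup(1024, 32, 8))
```

Under these assumed numbers the block-wise scheme needs fewer sequential passes whenever the block size exceeds the number of denoising steps per block; actual throughput depends on hardware and on how expensive each pass is.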

IMPACT This diffusion-based approach could speed up LLM generation while maintaining high quality.

RANK_REASON Frontier-lab model release with system card.


model release

AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

r/LocalLLaMA TIER_1 English(EN) · /u/nikhilprasanth · 2026-06-25 08:34

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone.

Source: https://www.reddit.com/r/LocalLLaMA/comments/1uf4azy/nvidia_has_released/

