NVIDIA has introduced Nemotron-Labs-TwoTower, an open-weight diffusion language model designed to improve text generation throughput. This model separates the tasks of context representation and token denoising into two distinct "towers." By employing this architecture, Nemotron-Labs-TwoTower maintains nearly 99% of the quality of autoregressive models while achieving over 2.4 times the generation speed. AI
IMPACT This novel architecture could significantly accelerate inference for large language models, potentially lowering costs and enabling new real-time applications.
RANK_REASON Frontier-lab model release with system card [lever_c_demoted from frontier_release: ic=2 ai=1.0]
AI-generated summary · Google Gemini · from 2 sources. How we write summaries →