Google has announced that its Gemma 4 model now runs up to three times faster thanks to the introduction of MTP (multi-token prediction) drafters. The drafters let the model predict and output several tokens at once, which significantly raises inference speed without degrading output quality or intelligence.
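The summary does not describe how the drafters work internally. As a rough illustration of the general draft-then-verify idea behind this kind of speedup, here is a minimal sketch under stated assumptions: a cheap drafter proposes a few tokens ahead, and the main model keeps only the proposals it agrees with, so the result matches what the main model would have produced alone. Every name here (`speculative_decode`, `toy_target`, `toy_drafter`, the block size `k`) is hypothetical and chosen for illustration; none of it is Google's implementation or API.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def greedy_decode(target_next: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one target-model step per generated token."""
    out = list(prompt)
    for _ in range(n):
        out.append(target_next(out))
    return out[len(prompt):]


def speculative_decode(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[Token],
    n: int,
    k: int = 4,
) -> Tuple[List[Token], int]:
    """Draft k tokens cheaply, then let the target model verify them.

    Returns the generated tokens and the number of verification rounds.
    In a real system each round would be a single batched forward pass of
    the target model; this toy version calls target_next once per position
    and only counts the rounds.
    """
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < n:
        # 1. The drafter proposes k tokens, each conditioned on the previous ones.
        draft: List[Token] = []
        for _ in range(k):
            draft.append(draft_next(out + draft))

        # 2. The target checks the proposals in order.
        rounds += 1
        accepted: List[Token] = []
        for tok in draft:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's own token and stop.
                accepted.append(expected)
                break
        else:
            # Every draft token matched, so the target adds one more token.
            accepted.append(target_next(out + accepted))

        out.extend(accepted)
    return out[len(prompt):len(prompt) + n], rounds


# Toy stand-ins for real models (assumptions for the demo only).
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_drafter(ctx: List[Token]) -> Token:
    # Agrees with the target except right after a 7, where it guesses wrong.
    return 0 if ctx[-1] == 7 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    tokens, rounds = speculative_decode(toy_target, toy_drafter, [0], n=12)
    print(tokens, rounds)
```

In this greedy variant, a wrong guess from the drafter costs only the unused draft tokens: the accepted output is always the target model's own choice, which is one way speed can improve while output quality stays the same.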
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Faster model inference could accelerate the development and deployment of AI applications.
RANK_REASON Announcement of performance improvements and new features for an existing model.