Google DeepMind researchers have found that Supervised Fine-Tuning (SFT), rather than other training stages such as Reinforcement Learning (RL), is the primary driver of safety properties in their Gemini models. In experiments, versions of Gemini 3.1 Pro and Gemini 3 Flash that received only pre-training and SFT showed safety performance remarkably similar to that of their production counterparts. The finding suggests that SFT is a high-leverage intervention point for improving the safety and behavior of future Gemini models.
AI IMPACT Highlights SFT as a critical stage for ensuring AI safety, potentially guiding future development and evaluation strategies.
RANK_REASON Research update from a major AI lab detailing findings on model training and safety properties.
AI-generated summary · Google Gemini · from 2 sources.