A new research paper titled "Retaining by Doing" examines how to mitigate catastrophic forgetting in language models during post-training adaptation. The study compares supervised fine-tuning (SFT) with reinforcement learning (RL) and finds that RL methods, which train on on-policy data, cause less forgetting while matching or exceeding SFT performance on target tasks. The paper attributes this robustness to RL's mode-seeking nature, which helps preserve prior knowledge. The findings suggest that training on approximately on-policy data could be an efficient way to reduce forgetting in practical applications.
AI IMPACT Suggests a more efficient method for adapting language models without sacrificing existing knowledge.
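To make "mode-seeking" concrete, here is a minimal toy sketch, not the paper's experiment. It assumes the common framing in which SFT-style maximum likelihood minimizes the forward KL divergence KL(p || q), while on-policy RL-style objectives behave like the reverse KL divergence KL(q || p); the summary itself only states that RL is mode-seeking. The bimodal target, the grid, and all function names are hypothetical choices made for illustration.

```python
import math

def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_on_grid(grid, mu, sigma=1.0):
    """Discretized unimodal distribution centered at mu."""
    return normalize([math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in grid])

def kl(a, b):
    """KL(a || b) for two strictly positive distributions on the same grid."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

# Hypothetical bimodal target p with modes at -3 and +3.
grid = [i * 0.1 for i in range(-80, 81)]
p = normalize([
    0.5 * math.exp(-0.5 * (x + 3) ** 2) + 0.5 * math.exp(-0.5 * (x - 3) ** 2)
    for x in grid
])

# Fit a single-mode q by grid search over its center mu.
candidates = [i * 0.1 for i in range(-50, 51)]

# Forward KL(p || q): q is penalized wherever p has mass that q misses,
# so q spreads its center between the two modes (mode-covering).
forward_mu = min(candidates, key=lambda m: kl(p, gaussian_on_grid(grid, m)))

# Reverse KL(q || p): q is penalized for placing mass where p is small,
# so q commits to a single mode (mode-seeking).
reverse_mu = min(candidates, key=lambda m: kl(gaussian_on_grid(grid, m), p))

print("forward-KL center:", round(forward_mu, 1))
print("reverse-KL center:", round(reverse_mu, 1))
```

In this toy setting the forward-KL fit lands between the two modes, while the reverse-KL fit settles on one of them. How this property carries over to forgetting in real language models is the subject of the paper and is not demonstrated by this sketch.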
AI-generated summary · Google Gemini · from 1 source.