Researchers have introduced FOGO, an optimizer designed to reduce forgetting during AI model training. FOGO targets both short-term forgetting at each training step and the long-term forgetting common in continual learning by detecting and resolving gradient interference. It uses spectral orthogonalization and a compact codebook memory to preserve past update directions. The authors report improved convergence and knowledge retention across tasks including fine-tuning LLaVA-7B and pretraining GPT-2, where FOGO outperforms existing optimizers such as Adam and Muon.
AI IMPACT: By reducing forgetting, FOGO could make AI model training more efficient and effective, particularly in continual learning scenarios.
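To make "gradient interference" concrete: a new gradient interferes with a stored update direction when their dot product is negative, meaning a step along the new gradient would partly undo the earlier update. The sketch below is a generic projection-based illustration of detecting and removing that conflict, not FOGO's actual algorithm. The function names and the plain list used as a "codebook" are assumptions made for the example, and spectral orthogonalization is not shown.

```python
# Illustrative only: a generic projection-based treatment of gradient
# interference. This is NOT the FOGO algorithm; all names are hypothetical.

def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))

def interferes(grad, past_dir):
    """A gradient interferes with a past direction if it points against it."""
    return dot(grad, past_dir) < 0.0

def resolve_interference(grad, past_dirs):
    """Project out the component of grad that opposes each stored direction.

    Directions are handled one after another, so when the stored directions
    are not mutually orthogonal a later projection can reintroduce a small
    conflict with an earlier one.
    """
    g = list(grad)
    for d in past_dirs:
        dd = dot(d, d)
        if dd == 0.0:
            continue  # skip degenerate (all-zero) directions
        gd = dot(g, d)
        if gd < 0.0:
            # Subtract the projection of g onto d, leaving g orthogonal to d.
            g = [gi - (gd / dd) * di for gi, di in zip(g, d)]
    return g

# Toy example: one stored direction and a new gradient that opposes it.
codebook = [[0.0, 1.0]]   # assumed stand-in for stored past update directions
new_grad = [1.0, -1.0]
resolved = resolve_interference(new_grad, codebook)
```

In this toy case the component of `new_grad` that opposes the stored direction is removed while the orthogonal component is left unchanged.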
Read on Hugging Face Daily Papers →
AI-generated summary · Google Gemini · from 2 sources.