Researchers have introduced FOGO, a new optimizer designed to reduce forgetting during AI model training. FOGO targets both short-term forgetting at each training step and the long-term forgetting common in continual learning by detecting and resolving gradient interference. It uses spectral orthogonalization and a compact codebook memory to preserve past update directions. It shows improved convergence and knowledge retention across tasks including fine-tuning LLaVA-7B and pretraining GPT-2, outperforming existing optimizers such as Adam and Muon.
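To make "gradient interference" concrete, here is a minimal sketch of one common way to detect and resolve it: two update directions interfere when their dot product is negative, and the conflict can be removed by projecting out the opposing component. This is a generic projection technique (in the style of PCGrad), not FOGO's algorithm, which the summary describes only at a high level. The function names `interferes` and `resolve` are hypothetical and chosen for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def interferes(g_new, g_old):
    """Return True when the new gradient opposes the stored direction."""
    return dot(g_new, g_old) < 0.0

def resolve(g_new, g_old):
    """Remove from g_new the component that opposes g_old.

    Assumption: a plain orthogonal projection stands in for whatever
    resolution step the optimizer actually uses.
    """
    if not interferes(g_new, g_old):
        return list(g_new)
    scale = dot(g_new, g_old) / dot(g_old, g_old)
    return [x - scale * y for x, y in zip(g_new, g_old)]

g_old = [1.0, 0.0]    # a past update direction kept in memory
g_new = [-1.0, 1.0]   # current gradient, partly undoing the past update
g_fixed = resolve(g_new, g_old)
print(g_fixed)        # [0.0, 1.0]
```

After projection, the new update is orthogonal to the stored direction, so applying it no longer undoes the earlier step along that direction.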
Impact: FOGO's ability to reduce forgetting could make AI model training more efficient and effective, particularly in continual learning scenarios.
Ranking rationale: The cluster contains a research paper detailing a new optimization algorithm for AI models.
Read on Hugging Face Daily Papers →
AI-generated summary · Google Gemini · from 2 sources.