Two new research papers explore methods to mitigate catastrophic forgetting in language models during fine-tuning. One paper introduces Sparse Memory Finetuning (SMF), which adds memory layers and updates only heavily accessed rows, showing improved performance on a medical exam task with minimal loss of general capabilities. The other paper investigates Sharpness-Aware Minimization (SAM) and other pretraining optimization techniques, demonstrating that biasing towards flatter minima can significantly reduce forgetting across various model sizes and post-training scenarios.
Impact: These techniques could lead to more robust and adaptable language models that retain general knowledge while learning new tasks.
Ranking rationale: Two arXiv papers present novel methods for mitigating catastrophic forgetting in language models.
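For readers unfamiliar with Sharpness-Aware Minimization, the following is a minimal sketch of a single SAM update as the technique is commonly described: take the gradient at the current weights, step a small distance `rho` uphill along it, then apply the gradient measured at that perturbed point. The toy quadratic loss, the function names (`loss`, `grad`, `sam_step`), and the values of `lr` and `rho` are illustrative assumptions, not the setup used in the paper.

```python
import math


def loss(w):
    # Hypothetical toy loss: a quadratic that is steep in w[0] and flat in w[1].
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], 0.1 * w[1]]


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM-style update on a list of floats (illustrative values for lr, rho)."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction is defined, so leave w unchanged.
        return list(w)
    # Ascent step: move to the approximate worst-case point in an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: update the ORIGINAL weights using the gradient at the perturbed point.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w = [1.0, 1.0]
for _ in range(100):
    w = sam_step(w, grad)
```

Because the descent gradient is evaluated at the perturbed point, steep directions are penalized more than flat ones, which is the sense in which the method is said to bias training toward flatter minima.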
- LoRA
- MedMCQA
- MetaMath
- OLMo-2-1B
- Qwen-2.5-0.5B-Instruct
- SAM
- Sharpness-Aware Minimization
- Sparse Memory Finetuning
- TriviaQA
- WikiText
AI-generated summary · Google Gemini · from 4 sources.