RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [4 sources]

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Researchers report that Chain-of-Thought (CoT) fine-tuning, while improving reasoning, significantly degrades long-context recall in hybrid linear-attention models. The effect, termed "attention amnesia," shows up as performance drops on tasks such as Needle-In-A-Haystack (NIAH). To fix it, they propose QK-Restore, a training-free method that restores specific query-key projection weights from a pre-fine-tuning checkpoint, recovering long-context capability without sacrificing reasoning performance.
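
The weight-restoration idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `restore_qk`, the parameter-name patterns `q_proj` and `k_proj`, and the toy checkpoints are all assumptions made for illustration, and the brief does not say which layers or projections QK-Restore actually selects. Checkpoints are modeled as plain dictionaries mapping parameter names to weights.

```python
from typing import Any, Dict, Iterable


def restore_qk(
    finetuned: Dict[str, Any],
    base: Dict[str, Any],
    patterns: Iterable[str] = ("q_proj", "k_proj"),  # hypothetical name patterns
) -> Dict[str, Any]:
    """Return a copy of `finetuned` whose query/key projection entries
    are replaced by the corresponding entries from `base`."""
    patterns = tuple(patterns)
    merged = dict(finetuned)  # shallow copy so the inputs are not modified
    for name in finetuned:
        # Only swap parameters that exist in both checkpoints and match a pattern.
        if name in base and any(p in name for p in patterns):
            merged[name] = base[name]
    return merged


# Toy checkpoints (made-up names and values, for illustration only).
base = {
    "layers.0.attn.q_proj.weight": [1.0, 2.0],
    "layers.0.attn.k_proj.weight": [3.0, 4.0],
    "layers.0.attn.v_proj.weight": [5.0, 6.0],
    "layers.0.mlp.up_proj.weight": [7.0, 8.0],
}
finetuned = {
    "layers.0.attn.q_proj.weight": [1.5, 2.5],
    "layers.0.attn.k_proj.weight": [3.5, 4.5],
    "layers.0.attn.v_proj.weight": [5.5, 6.5],
    "layers.0.mlp.up_proj.weight": [7.5, 8.5],
}

restored = restore_qk(finetuned, base)
```

In this sketch the query and key projections revert to their pre-fine-tuning values while every other parameter keeps its fine-tuned value, which mirrors the training-free character of the method: no gradient steps are involved, only a selective copy between checkpoints.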

IMPACT Addresses a long-context recall regression that CoT fine-tuning introduces in hybrid LLMs, potentially making long-context capabilities more robust for advanced reasoning tasks.

QK-Restore
arXiv
Jet-Nemotron
Chain-of-Thought (CoT)
Hugging Face
Chain-of-Thought (CoT) fine-tuning
Pyrecall
Needle-In-A-Haystack (NIAH)