PulseAugur

CoT fine-tuning degrades LLM long-context recall; QK-Restore fixes it

Researchers report that chain-of-thought (CoT) supervised fine-tuning, which is meant to improve reasoning, systematically degrades long-context recall in hybrid linear-attention models. The effect appears across architectures including HypeNet and Jet-Nemotron, where retrieval accuracy drops after fine-tuning, and is attributed to attention gradients being biased toward short-range patterns. The proposed remedy, QK-Restore, is training-free: it selectively reverts the query-key projection parameters to their pre-fine-tuning values, which restores long-context recall without compromising reasoning performance.
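
The restore step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the summary says only that QK-Restore selectively reverts query-key projections, and does not say which layers are chosen, so this version simply reverts every parameter whose name matches. The function name `qk_restore`, the `QK_MARKERS` name fragments, and the checkpoint keys are all hypothetical, and plain Python lists stand in for weight tensors.

```python
from typing import Dict, List, Sequence

# Assumption: query/key projection weights can be identified by these name
# fragments. Real checkpoints may use different parameter names.
QK_MARKERS: Sequence[str] = ("q_proj", "k_proj")


def qk_restore(
    base: Dict[str, List[float]],
    tuned: Dict[str, List[float]],
    markers: Sequence[str] = QK_MARKERS,
) -> Dict[str, List[float]]:
    """Return a copy of `tuned` whose query/key entries come from `base`."""
    restored = dict(tuned)  # shallow copy: all other parameters stay fine-tuned
    for name in tuned:
        if any(marker in name for marker in markers):
            if name not in base:
                raise KeyError(f"{name} missing from base checkpoint")
            # Revert this projection to its pre-fine-tuning values.
            restored[name] = list(base[name])
    return restored


# Toy checkpoints with made-up numbers, standing in for real weights.
base_ckpt = {
    "layer0.q_proj": [0.1, 0.2],
    "layer0.k_proj": [0.3, 0.4],
    "layer0.v_proj": [0.5, 0.6],
}
tuned_ckpt = {
    "layer0.q_proj": [0.9, 0.8],
    "layer0.k_proj": [0.7, 0.6],
    "layer0.v_proj": [0.55, 0.65],
}

restored_ckpt = qk_restore(base_ckpt, tuned_ckpt)
```

In this toy run, the query and key entries of `restored_ckpt` match the base checkpoint while the value projection keeps its fine-tuned numbers. No gradient step is involved, which is what "training-free" means here.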

IMPACT Offers a way to preserve long-context recall in hybrid linear-attention LLMs after reasoning-focused fine-tuning, which could improve their usefulness on long-document tasks.

RANK_REASON Academic paper detailing a novel method to address a specific LLM training issue.


AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. arXiv cs.CL TIER_1 English(EN) · Xinyu Zhou, Boyu Zhu, Yi Xu, Zhiwei Li, Yingfa Chen, Huiming Wang, Zhijiang Guo ·

    Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

    arXiv:2606.11052v1. Chain-of-thought (CoT) supervised fine-tuning (SFT) is widely adopted to improve reasoning ability, yet we find that it systematically degrades long-context recall in hybrid linear-attention models. Across architectures including Hy…

  2. arXiv cs.CL TIER_1 English(EN) · Zhijiang Guo ·

    Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

    Chain-of-thought (CoT) supervised fine-tuning (SFT) is widely adopted to improve reasoning ability, yet we find that it systematically degrades long-context recall in hybrid linear-attention models. Across architectures including HypeNet and Jet-Nemotron, retrieval performance on…

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

    Chain-of-thought supervised fine-tuning degrades long-context recall in hybrid linear-attention models by biasing attention gradients toward short-range patterns, but a training-free method called QK-Restore can restore long-context capabilities by reverting query-key projections…