PulseAugur

CoT fine-tuning harms LLM long-context recall; QK-Restore offers fix

Researchers report that chain-of-thought (CoT) supervised fine-tuning, widely used to improve reasoning, systematically degrades long-context recall in hybrid linear-attention language models. The effect, termed 'attention amnesia,' lowers retrieval performance across architectures including HypeNet and Jet-Nemotron. To address it, the authors propose QK-Restore, a training-free method that selectively restores specific attention parameters from the pre-fine-tuning checkpoint. According to the summary, this recovers long-context capability without compromising reasoning ability.
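As a rough illustration of what "selectively restoring attention parameters from a pre-fine-tuning state" could look like, the sketch below copies selected weights from a base checkpoint back into a fine-tuned one. It is not the paper's implementation: this summary does not say which parameters QK-Restore touches, so the choice of query and key projections is only an inference from the method's name. The name fragments `q_proj` and `k_proj`, the helper `restore_qk`, and the toy checkpoints are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# ASSUMPTION: query/key projection weights can be recognised by these name
# fragments. Real checkpoints may use different names.
QK_PATTERNS: Tuple[str, ...] = ("q_proj", "k_proj")


def restore_qk(
    base_state: Dict[str, List[float]],
    tuned_state: Dict[str, List[float]],
    patterns: Sequence[str] = QK_PATTERNS,
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Return a copy of tuned_state in which every parameter whose name
    contains one of `patterns` is replaced by its value from base_state."""
    merged = dict(tuned_state)  # shallow copy; the inputs are left untouched
    restored: List[str] = []
    for name, base_value in base_state.items():
        if name in merged and any(p in name for p in patterns):
            merged[name] = base_value
            restored.append(name)
    return merged, restored


# Toy checkpoints: the "fine-tuned" weights are the base weights shifted by 0.5.
base = {
    "layers.0.attn.q_proj.weight": [1.0, 2.0],
    "layers.0.attn.k_proj.weight": [3.0, 4.0],
    "layers.0.attn.v_proj.weight": [5.0, 6.0],
    "layers.0.mlp.up_proj.weight": [7.0, 8.0],
}
tuned = {name: [v + 0.5 for v in values] for name, values in base.items()}

merged, restored = restore_qk(base, tuned)
print(restored)
```

In this toy run only the two query/key entries revert to their base values, while the value projection and MLP weights keep their fine-tuned values. No gradient step is involved, which is the sense in which such a procedure is training-free. With a PyTorch model, the same merge could presumably be applied to the dictionary returned by `state_dict()` and loaded back with `load_state_dict()`.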

IMPACT Identifies a side effect of CoT fine-tuning in hybrid linear-attention models and offers a training-free fix, which could make long-context recall more robust in reasoning-tuned models.

RANK_REASON Academic paper detailing a novel method to address a specific LLM training issue.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English (EN) · Zhijiang Guo

    Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

    Chain-of-thought (CoT) supervised fine-tuning (SFT) is widely adopted to improve reasoning ability, yet we find that it systematically degrades long-context recall in hybrid linear-attention models. Across architectures including HypeNet and Jet-Nemotron, retrieval performance on…