New research reveals hidden states in LLMs contain task-solving information

By PulseAugur Editorial Team · [1 source] · 2026-04-28 04:00

Researchers investigated what information the hidden states of language models encode during chain-of-thought (CoT) reasoning. Using activation patching on the GSM8K dataset, they found that individual CoT tokens carry task-relevant information that significantly improves answer accuracy when transferred into a direct-answer generation run. This task-solving information is more concentrated in correct CoT runs and is unevenly distributed across tokens: it appears earlier in the reasoning trace and in mid-to-late model layers. The study also found that language tokens matter more for steering correct reasoning, while mathematical tokens primarily encode answer-proximal content.
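As a rough illustration of the activation patching idea mentioned above, the following toy sketch caches an intermediate state from one run and writes it into another run at the same layer. It is not the paper's setup: the three-layer "model", the function `run_with_cache`, and the inputs are all hypothetical stand-ins, and real experiments would patch transformer hidden states at chosen token positions and layers.

```python
from typing import Callable, Dict, List, Optional, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_with_cache(
    layers: List[Layer],
    x: Vector,
    patch: Optional[Dict[int, Vector]] = None,
) -> Tuple[Vector, Dict[int, Vector]]:
    """Run the layers in order and record each layer's output.

    If `patch` maps a layer index to a vector, that layer's output is
    overwritten with the vector before the next layer runs.
    """
    cache: Dict[int, Vector] = {}
    h = list(x)
    for i, layer in enumerate(layers):
        h = layer(h)
        if patch is not None and i in patch:
            h = list(patch[i])  # overwrite with the donor run's state
        cache[i] = list(h)
    return h, cache


# Hypothetical toy layers standing in for model layers.
def scale(h: Vector) -> Vector:
    return [2.0 * v for v in h]


def shift(h: Vector) -> Vector:
    return [v + 1.0 for v in h]


def total(h: Vector) -> Vector:
    return [sum(h)]


TOY_LAYERS: List[Layer] = [scale, shift, total]

# "Donor" run (stands in for the CoT run) and "recipient" run
# (stands in for direct-answer generation).
cot_out, cot_cache = run_with_cache(TOY_LAYERS, [1.0, 2.0])
direct_out, _ = run_with_cache(TOY_LAYERS, [0.0, 0.0])

# Patch the donor's layer-1 state into the recipient run.
patched_out, _ = run_with_cache(
    TOY_LAYERS, [0.0, 0.0], patch={1: cot_cache[1]}
)
```

In this toy, patching the layer-1 state makes the recipient run produce the donor's output, which is the sense in which a patched state can be said to carry the information needed for the answer.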

Impact: Offers new insight into how language models represent information and fail during reasoning, which could guide future model development.

Ranking rationale: Academic paper analyzing the internal workings of language model reasoning.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Houman Mehrafarin, Amit Parekh, Ioannis Konstas · 2026-04-28 04:00

When Chain-of-Thought Fails, the Solution Hides in the Hidden States

arXiv:2604.23351v1 Announce Type: new Abstract: Whether intermediate reasoning is computationally useful or merely explanatory depends on whether chain-of-thought (CoT) tokens contain task-relevant information. We present a mechanistic causal analysis of CoT on GSM8K using activa…

