PulseAugur

New research reveals hidden states in LLMs contain task-solving information

Researchers investigated what information the hidden states of language models encode during chain-of-thought (CoT) reasoning. Using activation patching on the GSM8K dataset, they found that individual CoT tokens carry task-relevant information that can significantly improve answer accuracy when transferred into a direct-answer generation run. This task-solving information is more concentrated in correct CoT runs and is unevenly distributed: it appears earlier in the reasoning trace and in mid-to-late model layers. The study also found that language tokens matter more for steering correct reasoning, while mathematical tokens primarily encode answer-proximal content.
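
For readers unfamiliar with activation patching, the idea can be illustrated with a minimal sketch. This is not the paper's code: the toy causal model, its `WEIGHTS`, the `SOURCE` and `TARGET` inputs, and the helper names `forward` and `patching_effects` are all hypothetical stand-ins chosen for illustration. The sketch caches hidden states from one run (standing in for a CoT run), overwrites a single state in a second run (standing in for direct-answer generation), and measures how far the output moves toward the first run's output.

```python
import math


def forward(weights, tokens, patch=None):
    """Run a toy causal model over scalar token states (hypothetical model).

    weights: one float per layer.
    patch:   optional (layer, position, value) that overwrites one hidden state.
    Returns (output, cache), where cache[layer][position] is a hidden state.
    """
    hidden = list(tokens)
    cache = []
    for layer, w in enumerate(weights):
        new_hidden = []
        for pos in range(len(hidden)):
            # Causal mixing: each position sees itself and earlier positions.
            context = sum(hidden[: pos + 1]) / (pos + 1)
            new_hidden.append(hidden[pos] + math.tanh(w * context))
        hidden = new_hidden
        if patch is not None and patch[0] == layer:
            hidden[patch[1]] = patch[2]  # the patching step
        cache.append(list(hidden))
    # The last position plays the role of the answer token.
    return hidden[-1], cache


def patching_effects(weights, source_tokens, target_tokens):
    """Patch each (layer, position) state from the source run into the target
    run and record the fraction of the output gap that is recovered."""
    source_out, source_cache = forward(weights, source_tokens)
    target_out, _ = forward(weights, target_tokens)
    effects = {}
    for layer in range(len(weights)):
        for pos in range(len(target_tokens)):
            patched_out, _ = forward(
                weights, target_tokens,
                patch=(layer, pos, source_cache[layer][pos]),
            )
            effects[(layer, pos)] = (patched_out - target_out) / (source_out - target_out)
    return effects


# Assumed toy values, not taken from the study.
WEIGHTS = [0.8, -0.5, 0.3]
SOURCE = [1.0, 2.0, 0.5, 1.5]   # stands in for the CoT run
TARGET = [0.0, 0.0, 0.0, 0.0]   # stands in for the direct-answer run

effects = patching_effects(WEIGHTS, SOURCE, TARGET)
for (layer, pos), effect in sorted(effects.items()):
    print(f"layer={layer} pos={pos} recovered={effect:.3f}")
```

In this toy setup, a recovered value near 1 means the patched state alone carries the source run's answer, which mirrors how a per-token, per-layer sweep can localize where task-solving information sits. With a real language model the same pattern would typically be implemented with forward hooks on transformer layers rather than a hand-written loop.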

Impact: Provides new insights into how language models represent information and fail during reasoning, potentially guiding future model development.

Ranking rationale: Academic paper analyzing the internal workings of language model reasoning.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.CL · TIER_1 · English (EN) · Houman Mehrafarin, Amit Parekh, Ioannis Konstas

    When Chain-of-Thought Fails, the Solution Hides in the Hidden States

    arXiv:2604.23351v1. Abstract: Whether intermediate reasoning is computationally useful or merely explanatory depends on whether chain-of-thought (CoT) tokens contain task-relevant information. We present a mechanistic causal analysis of CoT on GSM8K using activa…