Researchers have published a paper investigating how Transformers compute algorithmic intermediates, using arithmetic tasks as a testbed. Although a Transformer model achieved high accuracy on base-digit extraction, causal tests revealed that the internal representations of intermediates identified in the model were not actually used on the computation path to the output. The result highlights a divergence between what probes suggest a model represents and how the model causally uses that information, even when explicit algorithmic hypotheses are available.
Impact: Challenges current methods for understanding internal model computations and suggests a need for more robust causal analysis beyond simple probing.
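The gap between probing and causal use can be illustrated with a toy sketch. This is not the paper's method or model: the two-dimensional hidden state, the `toy_readout` head, and both helper functions are hypothetical, chosen only to show how a variable can be perfectly decodable by a linear probe while removing it leaves the output unchanged.

```python
import random


def fit_linear_probe(hidden, target):
    """Least-squares fit of target ~ w0*h0 + w1*h1 via the 2x2 normal equations."""
    s00 = sum(h[0] * h[0] for h in hidden)
    s01 = sum(h[0] * h[1] for h in hidden)
    s11 = sum(h[1] * h[1] for h in hidden)
    t0 = sum(h[0] * t for h, t in zip(hidden, target))
    t1 = sum(h[1] * t for h, t in zip(hidden, target))
    det = s00 * s11 - s01 * s01
    return ((t0 * s11 - t1 * s01) / det, (s00 * t1 - s01 * t0) / det)


def ablate_direction(h, w):
    """Project out the component of hidden state h along direction w."""
    coeff = (h[0] * w[0] + h[1] * w[1]) / (w[0] ** 2 + w[1] ** 2)
    return (h[0] - coeff * w[0], h[1] - coeff * w[1])


def toy_readout(h):
    # Hypothetical output head: it reads coordinate 0 only.
    return h[0]


rng = random.Random(0)
hidden = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
# Assumption: the "intermediate" is stored in coordinate 1, which the head ignores.
intermediate = [h[1] for h in hidden]

# Probing: the intermediate is linearly decodable from the hidden state.
w = fit_linear_probe(hidden, intermediate)
probe_error = max(
    abs(w[0] * h[0] + w[1] * h[1] - z) for h, z in zip(hidden, intermediate)
)

# Causal test: ablate the probed direction and measure the change in output.
probed_shift = max(
    abs(toy_readout(ablate_direction(h, w)) - toy_readout(h)) for h in hidden
)

# Contrast: ablate the direction the head actually reads.
used_shift = max(
    abs(toy_readout(ablate_direction(h, (1.0, 0.0))) - toy_readout(h))
    for h in hidden
)

print(f"probe error: {probe_error:.2e}")
print(f"output shift after ablating probed direction: {probed_shift:.2e}")
print(f"output shift after ablating used direction: {used_shift:.2e}")
```

In this toy setup the probe error and the shift from ablating the probed direction are both negligible, while ablating the coordinate the head reads changes the output substantially. A probe alone would therefore overstate the role of the intermediate.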
Ranking rationale: The cluster contains an academic paper detailing novel research findings.
AI-generated summary · Google Gemini · from 2 sources.