Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

New Unpack Method Dissects Transformer Component Interactions

By the PulseAugur Editorial Team · [2 sources] · 2026-05-22 09:03

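Researchers have developed a new method called Unpack for analyzing the internal workings of Transformer models. The technique uses a backward recursion to trace how different components, such as attention and MLP layers, contribute to the model's output. Unpack identifies interaction strengths and per-token attributions from a single forward pass, without interventions or additional training.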

Impact: Offers a novel way to understand Transformer model behavior, which may help with debugging and with improving model interpretability.

Ranking rationale: This cluster contains an academic paper detailing a new research method for analyzing Transformer models.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Po-Kai Chen, Niki van Stein, Aske Plaat · 2026-05-25 04:00

Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

arXiv:2605.23393v1 Announce Type: cross Abstract: Mechanistic interpretability of transformers requires identifying not just which components matter but how they compose into the computational route that produced a prediction. Both attention and MLP follow a shared key-value temp…
arXiv cs.AI TIER_1 English(EN) · Aske Plaat · 2026-05-22 09:03

Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

Mechanistic interpretability of transformers requires identifying not just which components matter but how they compose into the computational route that produced a prediction. Both attention and MLP follow a shared key-value template $φ(S)U$. We exploit this structure to develop…
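
To make the shared key-value template $φ(S)U$ concrete, the following is a minimal NumPy sketch of how an attention head and an MLP can both be written as "scores S, nonlinearity φ, values U". It is not the Unpack algorithm, which the truncated abstract does not spell out. The helper name `lookup`, the toy dimensions, the random weights, and the specific choices of S and U for each component are assumptions made for illustration, following the standard textbook forms of single-head attention and a two-layer ReLU MLP.

```python
import numpy as np

def softmax(s):
    # Numerically stable softmax over the last axis.
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def relu(s):
    return np.maximum(s, 0.0)

def lookup(S, U, phi):
    """Hypothetical helper: generic key-value lookup phi(S) @ U."""
    return phi(S) @ U

rng = np.random.default_rng(0)
T, d, h = 4, 8, 16  # tokens, model width, MLP hidden width (toy sizes)
X = rng.normal(size=(T, d))

# Attention head (single head, no causal mask, for brevity):
#   S = Q K^T / sqrt(d), phi = softmax, U = X W_V
W_Q, W_K, W_V = (rng.normal(size=(d, d)) for _ in range(3))
S_attn = (X @ W_Q) @ (X @ W_K).T / np.sqrt(d)
V = X @ W_V
attn_out = lookup(S_attn, V, softmax)

# MLP: S = X W_in (scores against hidden "keys"),
#   phi = ReLU, U = W_out (one "value" row per hidden unit)
W_in = rng.normal(size=(d, h))
W_out = rng.normal(size=(h, d))
mlp_out = lookup(X @ W_in, W_out, relu)

# Because the output is a weighted sum over slots, it splits exactly
# into one contribution per source token (attention shown here).
A = softmax(S_attn)
per_token = A[:, :, None] * V[None, :, :]  # (query, source token, d)
assert np.allclose(per_token.sum(axis=1), attn_out)
```

In this form the output of either component is a sum of value rows weighted by φ(S), so each slot's share of the output can be read off from the forward-pass activations alone. That additive structure is presumably what lets a method of this kind report per-token attributions without interventions or extra training, though the exact recursion used by Unpack is described only in the paper.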