IG-Lens: Exact Additive Probability Attribution Across Transformer Layers via Telescoping Integrated Gradients

New IG-Lens method exactly attributes token probability across Transformer layers

By the PulseAugur editorial team · [1 source] · 2026-06-30 04:00

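Researchers have developed IG-Lens, a new method for decoder-only Transformer models that attributes the probability of a predicted token exactly to specific layers. Unlike existing tools, which give approximate or biased estimates, IG-Lens uses telescoping integrated gradients to produce an exact additive decomposition in probability space. The approach accounts for the softmax nonlinearity, so the per-layer attributions sum exactly to the total change in the predicted probability.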
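The idea of a telescoping, additive decomposition in probability space can be illustrated with a small NumPy sketch. This is not the paper's implementation: it assumes a logit-lens-style readout in which each layer's hidden state is mapped to token probabilities through a single matrix `W` followed by a softmax, it uses synthetic hidden states, and all function names are hypothetical. The total probability change is split into one term per pair of consecutive layers, and each term is computed with integrated gradients along the straight line between the two hidden states. By the completeness property of integrated gradients, each term equals the probability difference across that layer, so the terms sum to the total change (here up to numerical quadrature error).

```python
import numpy as np

def target_prob(h, W, t):
    """Return (p[t], p), where p = softmax(W @ h) is the assumed readout."""
    z = W @ h
    z = z - z.max()              # shift logits for numerical stability
    p = np.exp(z)
    p /= p.sum()
    return p[t], p

def grad_target_prob(h, W, t):
    """Analytic gradient of softmax(W @ h)[t] with respect to h."""
    pt, p = target_prob(h, W, t)
    return pt * (W[t] - p @ W)

def ig_segment(h_prev, h_next, W, t, n_nodes=32):
    """Integrated gradients along the straight path h_prev -> h_next.

    Returns per-dimension attributions whose sum approximates
    p(h_next) - p(h_prev); the path integral is evaluated with
    Gauss-Legendre quadrature on [0, 1].
    """
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    alphas, weights = 0.5 * (x + 1.0), 0.5 * w   # map nodes from [-1, 1] to [0, 1]
    delta = h_next - h_prev
    avg_grad = sum(
        wk * grad_target_prob(h_prev + a * delta, W, t)
        for a, wk in zip(alphas, weights)
    )
    return delta * avg_grad

def layer_attributions(hidden_states, W, t):
    """One scalar attribution per pair of consecutive layers (telescoping sum)."""
    return np.array([
        ig_segment(a, b, W, t).sum()
        for a, b in zip(hidden_states[:-1], hidden_states[1:])
    ])

# Synthetic example: random readout matrix and a random walk of hidden states.
rng = np.random.default_rng(0)
d, vocab, n_layers = 8, 20, 6
W = rng.normal(size=(vocab, d))
hidden_states = np.cumsum(rng.normal(scale=0.5, size=(n_layers + 1, d)), axis=0)
target = 3

attr = layer_attributions(hidden_states, W, target)
total = (target_prob(hidden_states[-1], W, target)[0]
         - target_prob(hidden_states[0], W, target)[0])
print("per-layer attributions:", attr)
print("sum of attributions:", attr.sum(), "total probability change:", total)
```

Because the gradient is taken of the probability itself rather than of the logit, the softmax nonlinearity is part of the integrand, which is what lets the per-layer terms add up in probability space.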

Impact: Offers a more accurate way to understand a model's internal behavior, which may help with debugging and interpretability.

Ranking rationale: The cluster contains a paper detailing a new method for analyzing Transformer models.


AI-generated summary · Google Gemini · based on 1 source.


Sources [1]

arXiv cs.LG (TIER_1, English) · Duc Anh Nguyen · 2026-06-30 04:00

IG-Lens: Exact Additive Probability Attribution Across Transformer Layers via Telescoping Integrated Gradients

arXiv:2606.29693v1 · Announce Type: new · Abstract: We ask a simple question about decoder-only transformers: between which two layers is the probability of a predicted token actually produced? Existing layer-wise readout tools answer only approximately. The logit lens and its…
