Spectral Probe-Circuits: A Three-Step Recipe for Identifying Attention-Head Circuits in Pretrained Transformers

New method identifies attention-head circuits in Transformers

By the PulseAugur editorial team · [1 source] · 2026-05-26 04:00

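Researchers have developed a three-step method called Spectral Probe-Circuits for identifying specific computational circuits in pretrained Transformer models. The technique uses a spectral signal to rank attention heads by how much sustained, content-dependent computation they perform, and it requires neither labels nor attribution gradients. The method was validated across a range of model sizes and architectures, where it identified key circuits such as induction circuits; ablating those circuits caused a significant performance drop on a synthetic induction task.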

Impact: Offers a new way to understand a model's internal computation, which may help with interpretability and debugging.

Ranking rationale: This cluster contains an academic paper detailing a new method for analyzing AI models.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · English · Yongzhong Xu · 2026-05-26 04:00

Spectral Probe-Circuits: A Three-Step Recipe for Identifying Attention-Head Circuits in Pretrained Transformers

arXiv:2605.24059v1 Announce Type: cross Abstract: We present a three-step recipe for identifying attention-head circuits in pretrained transformers. A per-head spectral signal -- the time-integrated participation ratio of each head's attention output -- ranks heads doing sustaine…
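For readers unfamiliar with the participation ratio named in the abstract, the sketch below shows the standard definition, PR = (Σλ)² / Σλ², where λ are the eigenvalues of the covariance of a set of vectors. It is only an illustration under assumptions, not the paper's procedure: the truncated abstract does not say how the ratio is integrated over time, so that step is not attempted here, and the function names, the toy head outputs, and the choice to rank higher ratios first are all hypothetical.

```python
# Illustrative sketch only; not the paper's implementation.
# Assumption: each head's output is given as a list of equal-length vectors
# (one vector per token position). The "time-integrated" step is NOT shown.

def participation_ratio(outputs):
    """Participation ratio of the covariance of `outputs`.

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues).
    For a symmetric covariance C this equals tr(C)^2 / tr(C @ C), and
    tr(C @ C) is the sum of squared entries of C, so no eigendecomposition
    is needed.
    """
    n = len(outputs)
    d = len(outputs[0])
    # Center each dimension.
    means = [sum(row[j] for row in outputs) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in outputs]
    # Covariance matrix (population normalisation, 1/n).
    cov = [
        [sum(r[a] * r[b] for r in centered) / n for b in range(d)]
        for a in range(d)
    ]
    trace = sum(cov[a][a] for a in range(d))
    trace_sq = sum(cov[a][b] ** 2 for a in range(d) for b in range(d))
    if trace_sq == 0.0:
        return 0.0  # constant output: no variance in any direction
    return trace * trace / trace_sq


def rank_heads(head_outputs):
    """Sort head names by participation ratio, highest first.

    Assumption: ranking high-to-low is an illustrative choice; the source
    does not state the sort direction.
    """
    scores = {name: participation_ratio(x) for name, x in head_outputs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: head_a varies along one axis, head_b along two.
toy_heads = {
    "head_a": [[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0], [-2.0, 0.0]],
    "head_b": [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]],
}
ranking = rank_heads(toy_heads)
```

In this toy case `head_a` spreads its variance over one direction and `head_b` over two equal directions, so their ratios are 1 and 2 respectively. In a real model the per-head output vectors would have to be captured from the attention layers first, which this sketch leaves out.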
