Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations

Transformer models encode concepts in the low-variance spectral tail and syntax in high-variance directions

By the PulseAugur editorial team · [1 source] · 2026-05-05 04:00

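Researchers have identified a dual geometry in Transformer representations: concept directions anti-concentrate in the tail of the spectrum, whereas row contrasts of the static embeddings concentrate in high-variance directions. The pattern was observed across 17 models and 4 language pairs, and was further corroborated with SAE features and linear probes on Gemma and Llama. The results suggest that Transformers may move semantic content into spectrally quiet regions during processing, allowing concepts to operate with less interference from syntax.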

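Impact: Identifies a potential mechanism by which Transformers process and store semantic information, which may inform future model architectures.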

Ranking rationale: This is a research paper published on arXiv that details new findings about Transformer representations.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.LG · Pratyush Acharya, Nuraj Rimal, Habish Dhakal · 2026-05-05 04:00

Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations

arXiv:2605.01609v1. Abstract: We test whether the causal inner product of Park et al. (2024), defined by the unembedding covariance $\Sigma$, enables cross-lingual concept transport. Across 17 models and 4 language pairs, a matched-spectrum randomization…
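
To make "spectrally quiet" concrete, here is a minimal NumPy sketch, not the authors' code. It assumes the covariance $\Sigma$ is taken over the rows of a (vocabulary × hidden-size) unembedding matrix, and that the causal inner product has the form $x^\top \Sigma^{-1} y$. The excerpt above does not spell out either detail, so both are assumptions. The function names, the synthetic matrix, and the 50% tail cutoff are all hypothetical choices made for illustration.

```python
import numpy as np

def unembedding_covariance(unembed):
    """Covariance of the rows of a (vocab_size, d_model) unembedding matrix."""
    centered = unembed - unembed.mean(axis=0, keepdims=True)
    return centered.T @ centered / unembed.shape[0]

def tail_energy(direction, cov, tail_frac=0.5):
    """Share of a direction's squared norm in the lowest-variance eigenvectors."""
    _, eigvecs = np.linalg.eigh(cov)          # eigenvalues come back in ascending order
    k = int(round(tail_frac * cov.shape[0]))  # number of "quiet" eigenvectors (assumed cutoff)
    unit = direction / np.linalg.norm(direction)
    coeffs = eigvecs.T @ unit                 # coordinates in the eigenbasis
    return float(np.sum(coeffs[:k] ** 2))

def causal_inner_product(x, y, cov):
    """Assumed form x^T cov^{-1} y, computed without forming the inverse."""
    return float(x @ np.linalg.solve(cov, y))

# Synthetic stand-in for an unembedding matrix (not real model weights).
rng = np.random.default_rng(0)
scales = np.array([10.0, 5.0, 3.0, 2.0, 1.0, 0.5, 0.3, 0.1])
unembed = rng.standard_normal((5000, 8)) * scales
cov = unembedding_covariance(unembed)

eigvals, eigvecs = np.linalg.eigh(cov)
quiet, loud = eigvecs[:, 0], eigvecs[:, -1]   # lowest- and highest-variance directions
print(tail_energy(quiet, cov), tail_energy(loud, cov))
```

In this toy setup, a direction that "whispers" has tail energy near 1 and one that "shouts" has tail energy near 0. Under the assumed $\Sigma^{-1}$ weighting, the quiet direction also gets the largest causal norm, because low-variance axes are up-weighted.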
