Transformer models encode concepts in quiet spectral regions, syntax in high-variance ones

By the PulseAugur editorial team · [1 source] · 2026-05-05 04:00

Researchers have identified a dual geometry in transformer representations: concept directions anti-concentrate in the spectral tail, while static unembedding-row contrasts concentrate in high-variance directions. The pattern was observed across 17 models and 4 language pairs, with further evidence from SAE features and linear probes on Gemma and Llama. The findings suggest that transformers may move semantic content into spectrally quiet regions during processing, allowing concepts to be manipulated with less grammatical interference.
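To make "concentrating in high-variance directions" versus "anti-concentrating in the spectral tail" concrete, here is a minimal sketch of one way such a measurement could be set up. It is not the paper's method: the function name `spectral_mass`, the `top_k` cutoff, and the synthetic unembedding matrix are all illustrative assumptions. The sketch computes the share of a direction's squared norm that falls in the top eigen-directions of the unembedding covariance.

```python
import numpy as np


def spectral_mass(unembedding, direction, top_k):
    """Share of a direction's squared norm that lies in the top_k
    highest-variance eigen-directions of the unembedding covariance.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Covariance of the unembedding rows (shape: vocab_size x d_model).
    centered = unembedding - unembedding.mean(axis=0, keepdims=True)
    sigma = centered.T @ centered / centered.shape[0]

    # eigh returns eigenvalues in ascending order; reverse the columns
    # so that the highest-variance eigenvectors come first.
    _, eigvecs = np.linalg.eigh(sigma)
    eigvecs = eigvecs[:, ::-1]

    # Coordinates of the unit direction in the eigenbasis; their squares
    # sum to 1, so a partial sum is a fraction of the total mass.
    unit = direction / np.linalg.norm(direction)
    energy = (eigvecs.T @ unit) ** 2
    return float(energy[:top_k].sum())


# Synthetic stand-in for an unembedding matrix (an assumption, not model
# data): per-coordinate variance decays from loud to quiet.
rng = np.random.default_rng(0)
d_model, vocab_size = 32, 2000
scales = np.geomspace(10.0, 0.1, d_model)
fake_unembedding = rng.normal(size=(vocab_size, d_model)) * scales

loud_direction = np.eye(d_model)[0]    # aligned with the highest variance
quiet_direction = np.eye(d_model)[-1]  # aligned with the spectral tail

head_mass = spectral_mass(fake_unembedding, loud_direction, top_k=8)
tail_mass = spectral_mass(fake_unembedding, quiet_direction, top_k=8)
```

On this toy data, `head_mass` comes out close to 1 and `tail_mass` close to 0. In the summary's terms, a direction that "shouts" would look like the former, and one that "whispers" like the latter.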

Impact: Identifies a potential mechanism for how transformers process and store semantic information, which could inform future model architectures.

Ranking rationale: This is a research paper published on arXiv detailing novel findings about transformer representations.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG TIER_1 English(EN) · Pratyush Acharya, Nuraj Rimal, Habish Dhakal · 2026-05-05 04:00

Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations

arXiv:2605.01609v1 · Announce Type: new · Abstract: We test whether the causal inner product of Park et al. (2024), defined by the unembedding covariance $\Sigma$, enables cross-lingual concept transport. Across 17 models and 4 language pairs, a matched-spectrum randomization…
