PulseAugur
A Sharper Picture of Generalization in Transformers

New Theory Explains Transformer Generalization Through Fourier Spectra

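Researchers have developed a new theoretical framework for understanding how transformers generalize, focusing on the Fourier spectra of their target functions. The approach derives generalization bounds from PAC-Bayes theory, in contrast to prior work based on Rademacher complexity. The study shows that sparse spectra concentrated on low-degree components favor low-sharpness constructions that generalize well, a finding supported by empirical evaluations and interpretability studies.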
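To make "Fourier spectrum concentrated on low-degree components" concrete, here is a minimal sketch of the standard Fourier (Walsh-Hadamard) expansion of a function on the boolean cube {-1, +1}^n, computed by brute force. The two target functions (`sparse_parity` and `majority3`) and the helper names are hypothetical illustrations chosen for this example; they are not taken from the paper, and the code says nothing about the paper's PAC-Bayes bounds themselves.

```python
from itertools import combinations, product


def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} for f: {-1,+1}^n -> R.

    S is a tuple of coordinate indices, and
    f_hat(S) = E_x[f(x) * prod_{i in S} x_i] under the uniform distribution.
    """
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0.0
            for x in points:
                chi = 1  # parity character chi_S(x)
                for i in S:
                    chi *= x[i]
                total += f(x) * chi
            coeffs[S] = total / len(points)
    return coeffs


def weight_by_degree(coeffs, n):
    """Sum squared coefficients per degree |S| (the spectral weight profile)."""
    weights = [0.0] * (n + 1)
    for S, c in coeffs.items():
        weights[len(S)] += c * c
    return weights


# Hypothetical toy targets (assumptions for illustration only).
def sparse_parity(x):
    return x[0] * x[1]  # parity of the first two coordinates


def majority3(x):
    return 1 if sum(x[:3]) > 0 else -1  # majority vote of the first three


n = 4
parity_weights = weight_by_degree(fourier_coefficients(sparse_parity, n), n)
majority_weights = weight_by_degree(fourier_coefficients(majority3, n), n)

print("sparse parity, weight per degree:", parity_weights)
print("majority-of-3, weight per degree:", majority_weights)
```

Both toy functions have sparse spectra: the parity puts all of its weight on a single degree-2 coefficient, while the majority splits its weight between degree 1 and degree 3. For ±1-valued functions the weights always sum to 1 (Parseval's identity), so the profile shows how that fixed budget is distributed across degrees.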

Impact: Offers a new theoretical perspective for understanding, and potentially improving, transformer generalization.

Ranking rationale: This cluster contains an academic paper presenting a theoretical study of transformer generalization.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

  1. arXiv cs.AI TIER_1 English(EN) · Paul Lintilhac, Sair Shaikh ·

    A Sharper Picture of Generalization in Transformers

    arXiv:2605.20988v1 Announce Type: cross Abstract: We study transformers' generalization behavior on boolean domains from the perspective of the Fourier Spectra of their target functions. In contrast to prior work (Edelman et al., 2022; Trauger and Tewari, 2024), which derived gen…

  2. arXiv cs.AI TIER_1 English(EN) · Sair Shaikh ·

    A Sharper Picture of Generalization in Transformers

    We study transformers' generalization behavior on boolean domains from the perspective of the Fourier Spectra of their target functions. In contrast to prior work (Edelman et al., 2022; Trauger and Tewari, 2024), which derived generalization bounds from Rademacher complexity, we …

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    A Sharper Picture of Generalization in Transformers

    We study transformers' generalization behavior on boolean domains from the perspective of the Fourier Spectra of their target functions. In contrast to prior work (Edelman et al., 2022; Trauger and Tewari, 2024), which derived generalization bounds from Rademacher complexity, we …