PulseAugur

Transformers' expressive power explained by new measure-theoretic framework

Researchers have introduced a new measure-theoretic framework to understand the expressive power of Transformer architectures in modeling contextual relations. This framework connects standard softmax attention to entropy-regularized optimal transport, viewing attention as a normalized affinity function. The study establishes a universal approximation theorem, demonstrating that Transformers can approximate arbitrary contextual relation rules, with the normalization method influencing the representation of these relations.
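The link between softmax attention and entropy-regularized optimal transport can be illustrated with a small numerical sketch. It relies on a standard fact rather than on the paper's own construction, whose exact formulation is not given in this summary: row-wise softmax of a score matrix maximizes "expected score plus eps times entropy" over row-stochastic matrices, which is an entropic transport problem with only the row marginal constrained. Constraining both marginals instead leads to Sinkhorn normalization, shown here as one example of a different normalization. All function names and the `eps` parameter are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax_rows(scores, eps=1.0):
    """Row-wise softmax of scores / eps (standard attention normalization)."""
    z = scores / eps
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=1, keepdims=True)

def entropic_objective(p, scores, eps=1.0):
    """<p, scores> + eps * H(p), where H is the Shannon entropy of the entries of p."""
    entropy = -np.sum(p * np.log(p + 1e-300))  # tiny offset guards against log(0)
    return np.sum(p * scores) + eps * entropy

def sinkhorn(scores, eps=1.0, n_iters=200):
    """Alternate row and column normalization of the affinity exp(scores / eps).

    For a square, strictly positive affinity matrix this converges to a
    doubly stochastic matrix (both marginals uniform). Illustrative only;
    not claimed to be the normalization studied in the paper.
    """
    k = np.exp((scores - scores.max()) / eps)
    for _ in range(n_iters):
        k = k / k.sum(axis=1, keepdims=True)  # enforce row sums = 1
        k = k / k.sum(axis=0, keepdims=True)  # enforce column sums = 1
    return k / k.sum(axis=1, keepdims=True)   # finish on rows, like attention

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    queries = rng.normal(size=(4, d))
    keys = rng.normal(size=(4, d))
    scores = queries @ keys.T / np.sqrt(d)  # scaled dot-product affinities

    attn = softmax_rows(scores)
    print("softmax row sums:", attn.sum(axis=1))
    print("softmax column sums:", attn.sum(axis=0))
    print("Sinkhorn column sums:", sinkhorn(scores).sum(axis=0))
    print("entropic objective at softmax:", entropic_objective(attn, scores))
```

Softmax fixes only the row sums, so a key can receive arbitrarily much or little total attention, while the Sinkhorn variant also balances the columns. This is one concrete sense in which the choice of normalization changes which relations the attention matrix can represent.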

Impact: Provides a theoretical foundation for Transformer capabilities, potentially guiding future architectural improvements.

Ranking rationale: Academic paper introducing a new theoretical framework for understanding Transformer architectures.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.


Sources [1]

  1. arXiv cs.LG · TIER_1 · English (EN) · Demián Fraiman

    On the Expressive Power of Contextual Relations in Transformers

    arXiv:2603.25860v2 Announce Type: replace-cross Abstract: Transformer architectures have achieved remarkable empirical success in modeling contextual relations, yet a clear understanding of their expressive power is still lacking. In this work, we introduce a measure-theoretic fr…