PulseAugur

New research questions the role of superposition in Transformers for time series forecasting

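Researchers examined the internal representations of Transformer models used for time series forecasting and found that complex mechanisms such as superposition are not necessary for competitive performance. Studies applying sparse autoencoders to models such as PatchTST indicate that the representations remain sparse and stable even when the dictionary is expanded, and that the models show very little sensitivity to interventions on the latents. Separately, a survey and a new method called DyWPE highlight the importance of positional encoding in Transformer-based time series analysis; DyWPE improves accuracy by making the positional encoding signal-aware.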
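For readers unfamiliar with the sparse autoencoders mentioned above, here is a minimal sketch of one forward pass, assuming a plain ReLU autoencoder with an L1 sparsity penalty. It is not the papers' implementation: the function name `sae_forward`, the dimensions, the `l1_coeff` value, and the random stand-in activations are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """One forward pass of a plain ReLU sparse autoencoder (illustrative only)."""
    # Encode into an overcomplete dictionary; ReLU keeps latents non-negative.
    z = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)
    # Decode back to the model's activation space.
    x_hat = z @ W_dec + b_dec
    # Reconstruction error plus an L1 penalty that pushes latents toward zero.
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = l1_coeff * np.mean(np.sum(np.abs(z), axis=1))
    return z, x_hat, recon + sparsity

rng = np.random.default_rng(0)
d_model, d_dict = 16, 64                   # assumed sizes; dictionary is 4x wider
acts = rng.normal(size=(32, d_model))      # random stand-in for patch activations
W_enc = rng.normal(scale=0.1, size=(d_model, d_dict))
W_dec = rng.normal(scale=0.1, size=(d_dict, d_model))
b_enc = np.zeros(d_dict)
b_dec = np.zeros(d_model)

z, x_hat, loss = sae_forward(acts, W_enc, b_enc, W_dec, b_dec)
# Average number of active latents per input (the "L0" sparsity measure).
l0 = float(np.mean(np.sum(z > 0, axis=1)))
```

In a real analysis the weights would be trained to minimize `loss` on activations taken from the forecasting model, and `l0` would be tracked as `d_dict` grows to see whether the representation stays sparse.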

Impact: Suggests that simpler mechanisms may be sufficient for Transformers on time series tasks, which could simplify model design and improve efficiency.

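Ranking rationale: Multiple arXiv papers discuss Transformer models for time series forecasting, covering mechanistic interpretability and positional encoding.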


AI-generated summary · Google Gemini · from 4 sources.


Sources [4]

  1. arXiv cs.LG TIER_1 English(EN) · Alper Yıldırım ·

    Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

    arXiv:2605.05151v1 (announce type: new). Abstract: Transformer architectures have been widely adopted for time series forecasting, yet whether the representational mechanisms that make them powerful in NLP actually engage on time series data remains unexplored. The persistent compet…

  2. arXiv cs.LG TIER_1 English(EN) · Habib Irani, Vangelis Metsis ·

    Positional Encoding in Transformer-Based Time Series Models: A Survey

    arXiv:2502.12370v3 (announce type: replace). Abstract: Recent advancements in transformer-based models have greatly improved time series analysis, providing robust solutions for tasks such as forecasting, anomaly detection, and classification. A crucial element of these models is po…

  3. arXiv cs.LG TIER_1 English(EN) · Habib Irani, Vangelis Metsis ·

    DyWPE: Signal-Aware Dynamic Wavelet Positional Encoding for Time Series Transformers

    arXiv:2509.14640v2 (announce type: replace). Abstract: Existing positional encoding methods in transformers are fundamentally signal-agnostic, deriving positional information solely from sequence indices while ignoring the underlying signal characteristics. This limitation is partic…

  4. arXiv cs.AI TIER_1 English(EN) · Alper Yıldırım ·

    Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

    Transformer architectures have been widely adopted for time series forecasting, yet whether the representational mechanisms that make them powerful in NLP actually engage on time series data remains unexplored. The persistent competitiveness of simple linear models such as DLinea…