A new research paper explores the mismatch in optimization geometry inherent in teacher forcing methods used to train recurrent neural networks (RNNs) on chaotic dynamical systems. The study compares the curvature of identity teacher forcing (ITF) with that of the marginal likelihood in a probabilistic switching augmentation of almost-linear RNNs (AL-RNNs). Experiments on the Lorenz-63 system indicate that while windowed evidence fine-tuning can improve held-out evidence, it may degrade crucial dynamical quantities relative to models initially trained with ITF.
Impact: This research may lead to more stable and accurate training methods for RNNs applied to complex, chaotic systems.
Ranking rationale: Academic paper published on arXiv detailing theoretical and experimental findings in machine learning.
- Almost-Linear RNNs
- AL-RNNs
- arXiv
- Bayes
- Lorenz-63
- Machine Learning
- Teacher Forcing
- Recurrent Neural Networks
AI-generated summary · Google Gemini · from 2 sources.