A new research paper examines the optimization geometry mismatch inherent in teacher forcing methods for training recurrent neural networks (RNNs) on chaotic dynamical systems. The study compares the curvature of the identity teacher forcing (ITF) objective with that of the marginal likelihood in a probabilistic switching augmentation of almost-linear RNNs (AL-RNNs). Experiments on the Lorenz-63 system indicate that while windowed evidence fine-tuning can improve held-out evidence, it may degrade key dynamical quantities relative to models trained with ITF alone.
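For readers unfamiliar with the training scheme the paper builds on, below is a minimal, illustrative sketch of identity teacher forcing for a piecewise-linear RNN fit to Lorenz-63 data. It is not the paper's code: the exact AL-RNN parameterization, the forcing interval `tau`, the gradient truncation at forcing points, and all hyperparameters are assumptions chosen for brevity.

```python
# Hypothetical sketch: identity teacher forcing (ITF) on Lorenz-63 data.
# Model form, forcing interval, and hyperparameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn


def lorenz63(T=500, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz-63 system with a simple Euler scheme."""
    x = np.empty((T, 3), dtype=np.float32)
    x[0] = (1.0, 1.0, 1.0)
    for t in range(T - 1):
        dx = np.array([sigma * (x[t, 1] - x[t, 0]),
                       x[t, 0] * (rho - x[t, 2]) - x[t, 1],
                       x[t, 0] * x[t, 1] - beta * x[t, 2]], dtype=np.float32)
        x[t + 1] = x[t] + dt * dx
    return torch.from_numpy(x)


class ALRNN(nn.Module):
    """Almost-linear RNN: linear latent update plus a ReLU on a subset of units.
    (Simplified stand-in for the AL-RNN family; A is a full matrix here.)"""
    def __init__(self, d_latent=16, d_relu=8):
        super().__init__()
        self.A = nn.Parameter(0.9 * torch.eye(d_latent))
        self.W = nn.Parameter(0.01 * torch.randn(d_latent, d_latent))
        self.h = nn.Parameter(torch.zeros(d_latent))
        self.d_relu = d_relu  # only the last d_relu units are rectified

    def step(self, z):
        phi = torch.cat([z[..., :-self.d_relu],
                         torch.relu(z[..., -self.d_relu:])], dim=-1)
        return z @ self.A.T + phi @ self.W.T + self.h


def train_itf(model, obs, tau=10, epochs=5, lr=1e-3):
    """Identity teacher forcing: the first d_obs latent units are identified with
    the observations; every `tau` steps the generated state is reset to them."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    d_obs = obs.shape[1]
    for _ in range(epochs):
        z = torch.zeros(model.A.shape[0])
        z[:d_obs] = obs[0]              # initialize read-out units from data
        loss = 0.0
        for t in range(1, obs.shape[0]):
            if t % tau == 0:            # sparse forcing with the observed state
                z = z.detach()          # truncate gradients at forcing points
                z = torch.cat([obs[t - 1], z[d_obs:]])
            z = model.step(z)
            loss = loss + ((z[:d_obs] - obs[t]) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()


data = lorenz63(T=500)
model = ALRNN()
train_itf(model, data, tau=10, epochs=5)
```

The identity mapping between the leading latent units and the observations is what makes the forcing "identity" teacher forcing; the paper's contribution concerns how the curvature of this objective relates to that of the marginal likelihood in the probabilistic switching formulation, which this sketch does not implement.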
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT This research may lead to more stable and accurate training methods for RNNs applied to complex, chaotic systems.
RANK_REASON Academic paper published on arXiv detailing theoretical and experimental findings in machine learning.