A Bifurcation Theory Framework for Gradient Descent on the Edge of Stability
Researchers have developed a framework that uses bifurcation theory to explain the behavior of gradient descent at the Edge of Stability (EoS) in deep learning. The framework analyzes overparameterized neural networks by decomposing the training dynamics into components normal and tangent to the minimizer manifold. The study shows that stable EoS training emerges from a flip bifurcation in the normal direction, influenced by the first Lyapunov coefficient, while the tangent dynamics drive sharpness downward. Under specific assumptions about the loss landscape, the research proves convergence to the minimizer manifold at the EoS threshold, unifying and extending prior findings.
AI IMPACT: Provides a theoretical framework to better understand and potentially control training dynamics in deep learning models.
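
The flip bifurcation mentioned in the summary can be illustrated with a one-dimensional toy problem. This is a minimal sketch, not the paper's construction: the loss L(x) = log cosh(x), the step sizes, and the function names `gd_step` and `run_gd` are all assumptions chosen for illustration. The sharpness at the minimizer is L''(0) = 1, so gradient descent with step size eta has multiplier 1 - eta at x = 0, which crosses -1 at eta = 2 (the usual 2/sharpness threshold). Because this example has a single coordinate, it shows only the normal-direction behavior and has no tangent dynamics.

```python
import math

def gd_step(x, eta):
    # One gradient descent step on the assumed toy loss L(x) = log(cosh(x)).
    # Its gradient is L'(x) = tanh(x); its sharpness at the minimizer is L''(0) = 1.
    return x - eta * math.tanh(x)

def run_gd(x0, eta, steps=2000):
    # Iterate gradient descent and return the last two iterates,
    # so a period-2 oscillation (x, -x) can be told apart from a fixed point.
    x = x0
    for _ in range(steps):
        x = gd_step(x, eta)
    return x, gd_step(x, eta)

# Step size below the threshold 2 / L''(0) = 2: iterates settle at the minimizer.
below = run_gd(0.5, 1.8)

# Step size slightly above the threshold: iterates settle on a period-2 orbit.
above = run_gd(0.5, 2.2)

print("eta = 1.8, last two iterates:", below)
print("eta = 2.2, last two iterates:", above)
```

For eta just above 2, the iterates in this toy do not diverge. They settle on a symmetric two-point orbit whose amplitude a satisfies 2a = eta * tanh(a). The orbit is stable here because the curvature of log cosh decreases away from the minimizer. In a loss whose curvature increases away from the minimizer, the same threshold crossing would instead lead to divergence, which is the kind of distinction a Lyapunov-coefficient condition is meant to capture.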