Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions in LSTM Networks
Researchers have identified a 'multiple-descent' phenomenon in long short-term memory (LSTM) networks: when training continues past the point of overfitting, model performance passes through repeated cycles of degradation and improvement. Their analysis links these cycles to phase transitions between order and chaos in the model. Optimal training points are consistently found at the critical transitions between the two phases, and the best model performance typically occurs at the first transition from order to chaos, where the 'edge of chaos' is widest and so allows better exploration of weight configurations.
AI IMPACT: This research reveals a previously unreported dynamic in neural network training and may offer insights into optimizing model performance and stability.
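To make the order/chaos distinction concrete, the sketch below estimates the largest Lyapunov exponent of a recurrent network, a standard diagnostic in which a negative value indicates ordered dynamics and a positive value indicates chaos. It is only an illustration under stated assumptions, not the method used in the research summarized above: it swaps the LSTM for a plain random tanh recurrent map, and the function name `largest_lyapunov` and all parameters (`gain`, `n_units`, `n_steps`, `n_burn`, `seed`) are hypothetical choices made here.

```python
import numpy as np


def largest_lyapunov(gain, n_units=200, n_steps=2000, n_burn=200, seed=0):
    """Estimate the largest Lyapunov exponent of h <- tanh(gain * W @ h).

    Assumption: a random tanh recurrent map stands in for a trained LSTM.
    A negative result suggests ordered dynamics, a positive one chaos.
    """
    rng = np.random.default_rng(seed)
    # Random recurrent weights with variance 1 / n_units.
    W = rng.normal(0.0, 1.0 / np.sqrt(n_units), size=(n_units, n_units))

    # Discard the initial transient so the state settles onto its attractor.
    h = rng.normal(size=n_units)
    for _ in range(n_burn):
        h = np.tanh(gain * (W @ h))

    # Evolve a unit tangent vector with the Jacobian of the map and
    # accumulate the log of its growth factor, renormalizing every step.
    v = rng.normal(size=n_units)
    v /= np.linalg.norm(v)
    log_growth = 0.0
    for _ in range(n_steps):
        h = np.tanh(gain * (W @ h))
        # Jacobian at the previous state: diag(1 - h_new**2) @ (gain * W).
        v = (1.0 - h ** 2) * (gain * (W @ v))
        norm = np.linalg.norm(v)
        log_growth += np.log(norm)
        v /= norm
    return log_growth / n_steps


if __name__ == "__main__":
    for gain in (0.5, 1.0, 2.0):
        print(f"gain={gain:.1f}  exponent={largest_lyapunov(gain):+.3f}")
```

In this toy setting the `gain` parameter plays the role that the weights play during training: sweeping it moves the network between the ordered and chaotic regimes, and the sign change of the estimated exponent marks the transition between them.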