A new theoretical analysis of two-layer ReLU neural networks trained with stochastic gradient descent (SGD) finds that the optimization process prioritizes spurious correlations over genuine signal features. The study shows that SGD can learn these spurious features exponentially fast, and that their presence can actively inhibit learning of the true signal. The research also identifies specific phase transitions in the learning dynamics: alignment between features and weight signs accelerates spurious learning, while large margins can suppress signal learning.
AI IMPACT: Highlights a fundamental challenge in AI training: current optimization methods may inherently favor shortcuts, which would undermine model reliability and generalization.
RANK_REASON: Academic paper detailing a theoretical analysis of neural network training dynamics.
AI-generated summary · Google Gemini · from 2 sources.
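The dynamics described in the summary can be illustrated with a small NumPy sketch. It is not the paper's construction: the data model (a weak but perfectly predictive signal coordinate, a strong spurious coordinate that agrees with the label about 95% of the time), the network width, the fixed second layer, the learning rate, and the helper names `make_data`, `train_sgd`, and `feature_alignment` are all assumptions chosen for illustration.

```python
import numpy as np

def make_data(n, rng, signal_scale=0.2, spurious_agree=0.95, noise_dim=8):
    """Hypothetical toy data, not the paper's model: labels y in {-1, +1}."""
    y = rng.choice([-1.0, 1.0], size=n)
    signal = signal_scale * y                  # weak but always correct
    agree = rng.random(n) < spurious_agree
    spurious = np.where(agree, y, -y)          # strong, correct ~95% of the time
    noise = 0.1 * rng.standard_normal((n, noise_dim))
    return np.column_stack([signal, spurious, noise]), y

def train_sgd(X, y, rng, width=16, lr=0.1, epochs=5):
    """Two-layer ReLU net f(x) = sum_j a_j * relu(w_j . x), logistic loss.

    Only the first layer W is trained; the second layer a is fixed to
    +/- 1/width (an assumption, common in theoretical setups).
    """
    n, d = X.shape
    W = 0.01 * rng.standard_normal((width, d))
    a = np.where(np.arange(width) % 2 == 0, 1.0, -1.0) / width
    for _ in range(epochs):
        for i in rng.permutation(n):           # one example per SGD step
            pre = W @ X[i]
            out = a @ np.maximum(pre, 0.0)
            # Derivative of log(1 + exp(-y * out)) with respect to out.
            g_out = -y[i] / (1.0 + np.exp(y[i] * out))
            # Chain rule through the ReLU: only active neurons are updated.
            W -= lr * g_out * np.outer(a * (pre > 0), X[i])
    return W

def feature_alignment(W):
    """Mean absolute first-layer weight on the signal and spurious coordinates."""
    return float(np.abs(W[:, 0]).mean()), float(np.abs(W[:, 1]).mean())

rng = np.random.default_rng(0)
X, y = make_data(400, rng)
W = train_sgd(X, y, rng)
signal_mass, spurious_mass = feature_alignment(W)
print(f"signal: {signal_mass:.3f}  spurious: {spurious_mass:.3f}")
```

In this toy setup the spurious coordinate has the larger magnitude, so it receives larger gradient updates and the first-layer weights tend to concentrate on it. That is only a qualitative illustration of shortcut learning under the stated assumptions; it does not reproduce the exponential rates or phase transitions the analysis reports.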