This paper theoretically investigates how data geometry influences generalization in overparameterized neural networks trained below the edge of stability. It derives generalization bounds for two-layer ReLU networks and shows that these bounds adapt to the intrinsic dimension of the data distribution. The results indicate that data distributions that are harder to shatter with ReLU activation thresholds lead to better generalization, while data concentrated on a sphere favors memorization.
AI impact: Provides theoretical insights into neural network generalization, potentially guiding future model architectures and training strategies.
Ranking rationale: This is a theoretical research paper published on arXiv concerning neural network generalization.
AI-generated summary · Google Gemini · from 1 source.