Deep Neural Networks Achieve Optimal Generalization Rates

By PulseAugur Editorial · [4 sources] · 2026-06-04 23:04

Two new papers submitted to arXiv analyze the generalization performance of gradient descent methods in deep neural networks. The research establishes minimax-optimal rates for excess population risk in deep ReLU networks trained with gradient descent (GD) and stochastic gradient descent (SGD), provided the network width scales appropriately with depth and sample size. These findings suggest that sufficiently wide deep neural networks can achieve generalization rates comparable to those of kernel methods.

IMPACT Establishes theoretical underpinnings for deep learning generalization, potentially guiding future model development and analysis.

RANK_REASON Two academic papers published on arXiv detailing theoretical advancements in deep learning generalization.



AI-generated summary · Google Gemini · from 4 sources.

COVERAGE [4]

arXiv stat.ML TIER_1 English(EN) · Junyu Zhou, Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou · 2026-06-08 04:00

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

arXiv:2606.06764v1. Abstract: Recent progress has been made in understanding the statistical generalization performance of gradient descent methods for overparameterized neural networks within the neural tangent kernel (NTK) regime. However, most of the existing…

arXiv stat.ML TIER_1 English(EN) · Junyu Zhou, Puyu Wang, Yunwen Lei, Marius Kloft, Yiming Ying · 2026-06-08 04:00

Generalization in Deep Neural Networks: Minimax Rates for Gradient Methods

arXiv:2606.06772v1. Abstract: Understanding the generalization performance of over-parameterized neural networks has become a central topic in deep learning theory. While recent advances, particularly works under the Neural Tangent Kernel (NTK) regime, have shed…

arXiv stat.ML TIER_1 English(EN) · Yiming Ying · 2026-06-04 23:31

Generalization in Deep Neural Networks: Minimax Rates for Gradient Methods

Understanding the generalization performance of over-parameterized neural networks has become a central topic in deep learning theory. While recent advances, particularly works under the Neural Tangent Kernel (NTK) regime, have shed light on the behavior of shallow architectures,…

arXiv stat.ML TIER_1 English(EN) · Ding-Xuan Zhou · 2026-06-04 23:04

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

Recent progress has been made in understanding the statistical generalization performance of gradient descent methods for overparameterized neural networks within the neural tangent kernel (NTK) regime. However, most of the existing work on regression problems is limited to shall…
