Brief

last 24h

[2/2] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · arXiv cs.LG English(EN) · 18h · [2 sources]

Fisher-Geometric Sharpness and the Implicit Bias of SGD toward Flat Minima

A new theoretical framework grounds the concept of flatness in Riemannian geometry to explain the generalization of deep learning models. It uses the Fisher Information Matrix (FIM) to define a reparametrization-invariant measure of sharpness, addressing a known limitation of Euclidean measures, whose values change when the model is reparametrized. Experiments on MNIST and CIFAR-10 show that this metric, Riemannian sharpness, tracks generalization performance and agrees with theoretical predictions about SGD's bias toward flatter minima.
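
To make the invariance claim concrete, here is a minimal numerical sketch. It does not reproduce the paper's definition of Riemannian sharpness, which this summary does not give. As an assumption for illustration only, it compares the Euclidean Hessian trace with a Fisher-normalized trace, Tr(F^-1 H), under an invertible linear reparametrization theta = A @ phi. The matrices H, F, and A and both helper functions are hypothetical stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(n, rng):
    """Return a random symmetric positive-definite n x n matrix."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)


n = 4
H = random_spd(n, rng)  # hypothetical stand-in for the loss Hessian at a minimum
F = random_spd(n, rng)  # hypothetical stand-in for the Fisher Information Matrix

# Assumed linear reparametrization theta = A @ phi.
# A is upper triangular with a nonzero diagonal, so it is invertible.
A = np.diag([0.5, 2.0, 3.0, 10.0]) + np.triu(np.ones((n, n)), k=1)

# Under theta = A @ phi, both matrices transform as A^T M A.
H_phi = A.T @ H @ A
F_phi = A.T @ F @ A


def euclidean_sharpness(hessian):
    """Trace of the Hessian: a Euclidean sharpness proxy."""
    return float(np.trace(hessian))


def fisher_normalized_sharpness(hessian, fisher):
    """Trace of F^{-1} H: an illustrative Fisher-normalized proxy."""
    return float(np.trace(np.linalg.solve(fisher, hessian)))


print("Euclidean:", euclidean_sharpness(H), "vs", euclidean_sharpness(H_phi))
print("Fisher-normalized:",
      fisher_normalized_sharpness(H, F), "vs",
      fisher_normalized_sharpness(H_phi, F_phi))
```

The Euclidean value changes with the choice of A, while the Fisher-normalized value does not, because F_phi^-1 H_phi = A^-1 (F^-1 H) A is similar to F^-1 H and similar matrices share a trace.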

IMPACT Provides a more robust theoretical foundation for understanding generalization in deep learning models.
- CIFAR-10
- SGD
- Dinh et al.
- MNIST database
- K-FAC
- Fisher Information Matrix

RESEARCH · arXiv stat.ML English(EN) · 2w · [3 sources]

Flatness and Generalization: Learning Multi-Index Models with Homogeneous Neural Networks

A new research paper examines the relationship between flatness and generalization in neural networks. Although prior work suggests that symmetries render flatness a vacuous metric, the study demonstrates a connection when learning multi-index models with homogeneous neural networks. It identifies specific classes of non-generalizing interpolators and proves that the "flattest" interpolators achieve low population loss, linking flatness directly to generalization across various activations and data distributions.

IMPACT Establishes a theoretical link between model flatness and generalization, potentially guiding future research in neural network optimization and design.
