PulseAugur

New study reveals SGD noise-covariance link to loss landscape curvature

Researchers have identified a more general relationship between the noise introduced by Stochastic Gradient Descent (SGD) and the curvature of the loss landscape in deep learning models. Prior work assumed that the SGD noise covariance is directly proportional to the Hessian of the loss; the study finds that this holds only under specific conditions. In general, the noise covariance is instead related to an expectation over per-sample Hessians, which suggests that the noise covariance and the Hessian approximately commute rather than coincide.
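
To make the two quantities concrete, here is a minimal NumPy sketch on a toy least-squares problem. It is not the paper's method or experimental setup: the data, the model, the helper names `sgd_noise_and_hessian` and `relative_commutator`, and the choice of a normalized Frobenius norm are all assumptions made for illustration. It computes the covariance of per-sample gradients (the SGD noise covariance, up to a batch-size factor) and the mean of per-sample Hessians, then measures how far the two matrices are from commuting.

```python
import numpy as np

def sgd_noise_and_hessian(X, y, w):
    """Return (C, H) for the per-sample loss 0.5 * (x @ w - y)**2.

    C is the covariance of per-sample gradients; H is the mean of the
    per-sample Hessians x x^T. Both are (d, d) symmetric matrices.
    """
    residual = X @ w - y                      # shape (n,)
    grads = residual[:, None] * X             # per-sample gradients, shape (n, d)
    centered = grads - grads.mean(axis=0)     # subtract the full-batch gradient
    C = centered.T @ centered / len(X)        # gradient (noise) covariance
    H = X.T @ X / len(X)                      # average per-sample Hessian
    return C, H

def relative_commutator(A, B):
    """Frobenius norm of AB - BA, normalized by ||A|| * ||B||.

    The value is 0 when A and B commute exactly and is at most 2.
    """
    comm = A @ B - B @ A
    return np.linalg.norm(comm) / (np.linalg.norm(A) * np.linalg.norm(B))

# Hypothetical toy data: anisotropic inputs so curvature differs by direction.
rng = np.random.default_rng(0)
n, d = 2000, 5
scales = np.array([3.0, 2.0, 1.0, 0.5, 0.1])
X = rng.normal(size=(n, d)) * scales
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
w = w_true + 0.05 * rng.normal(size=d)        # a point near the minimum

C, H = sgd_noise_and_hessian(X, y, w)
print("relative commutator of C and H:", relative_commutator(C, H))
print("diag(C):", np.diag(C))
print("diag(H):", np.diag(H))
```

A small relative commutator means the two matrices share approximately the same eigenvectors even when their eigenvalues are not proportional, which is the distinction between "commute" and "coincide" drawn above. A linear model is far simpler than the deep networks the study addresses, so the numbers printed here say nothing about the paper's results.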

IMPACT Provides a more accurate theoretical understanding of SGD noise and its interaction with loss landscape curvature, potentially guiding future optimization algorithm development.

RANK_REASON This is a research paper detailing theoretical findings and experimental validation on a machine learning optimization topic.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English(EN) · Yikuan Zhang, Ning Yang, Yuhai Tu ·

    On the Superlinear Relationship between SGD Noise Covariance and Loss Landscape Curvature

    arXiv:2602.05600v2. Abstract: Stochastic Gradient Descent (SGD) introduces anisotropic noise that is correlated with the local curvature of the loss landscape, thereby biasing optimization toward flat minima. Prior work often assumes an equivalence between t…