PulseAugur

New paper re-evaluates SGD dynamics, challenging Brownian motion analogy

A new paper challenges the common assumption that minibatch noise in Stochastic Gradient Descent (SGD) behaves like Brownian motion. The researchers propose an alternative model in which SGD moves through a loss landscape that fluctuates from step to step because of minibatch sampling. This framework reveals distinct behavior for SGD near critical points. In particular, the variance along nearly flat directions can grow over time, which indicates effective diffusion.
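To make "variance can grow over time in nearly flat directions" concrete, here is a minimal sketch on a one-dimensional quadratic loss. It assumes additive, state-independent gradient noise, which is a simplification chosen for illustration and not the fluctuating-landscape model the paper proposes. The names `variance_trajectory` and `stationary_variance` are hypothetical helpers. Under this assumption the variance of the iterate obeys an exact recursion, so no random sampling is needed.

```python
def variance_trajectory(curvature, lr, noise_std, steps, v0=0.0):
    """Variance of x_t under the toy update
        x_{t+1} = x_t - lr * (curvature * x_t + xi_t),
    where xi_t is zero-mean noise with standard deviation noise_std,
    independent across steps (an assumption, not the paper's model).
    """
    variances = [v0]
    v = v0
    for _ in range(steps):
        # The deterministic part contracts the variance by (1 - lr*curvature)^2;
        # the noise adds (lr * noise_std)^2 at every step.
        v = (1.0 - lr * curvature) ** 2 * v + (lr * noise_std) ** 2
        variances.append(v)
    return variances


def stationary_variance(curvature, lr, noise_std):
    """Fixed point of the recursion; it exists only for 0 < lr*curvature < 2."""
    return lr * noise_std ** 2 / (curvature * (2.0 - lr * curvature))


if __name__ == "__main__":
    flat = variance_trajectory(curvature=0.0, lr=0.1, noise_std=1.0, steps=1000)
    curved = variance_trajectory(curvature=1.0, lr=0.1, noise_std=1.0, steps=1000)
    print("flat direction, variance after 1000 steps:", flat[-1])
    print("curved direction, variance after 1000 steps:", curved[-1])
    print("curved direction, predicted plateau:", stationary_variance(1.0, 0.1, 1.0))
```

In this toy setting the curved direction settles at a finite variance, while the exactly flat direction (`curvature=0.0`) accumulates variance linearly in the number of steps, which is the signature of diffusion. Whether the paper's model produces the same growth law is not stated in the summary.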

IMPACT Challenges a fundamental assumption in AI training dynamics, potentially leading to more nuanced optimization strategies and better understanding of model convergence.

RANK_REASON The cluster contains an academic paper detailing new theoretical insights and empirical evidence regarding the dynamics of Stochastic Gradient Descent.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Igor Ignashin, Anna Radovskaya, Andrew Semenov, Egor Lopatin, Stanislav Potapov, Aleksandr Kovalenko, Andrey Veprikov, Aleksandr Shestakov, Andrey Leonidov, Aleksandr Beznosikov

    Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics

    arXiv:2605.22644v1. Abstract: Stochastic Gradient Descent (SGD) is commonly modeled as a Langevin process, assuming that minibatch noise acts as Brownian motion. However, this approximation relies on a continuous-time limit and a sqrt(eta) noise scaling that doe…

  2. arXiv cs.LG TIER_1 English(EN) · Aleksandr Beznosikov

    Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics

    Stochastic Gradient Descent (SGD) is commonly modeled as a Langevin process, assuming that minibatch noise acts as Brownian motion. However, this approximation relies on a continuous-time limit and a sqrt(eta) noise scaling that does not match the discrete SGD update at finite le…
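For reference, the two objects the abstract contrasts can be written in the standard textbook form below. The notation ($\theta$, $\eta$, $B_k$, $\xi_k$, $\Sigma$, $W_t$) is an assumption of this sketch and is not taken from the paper, whose own formulation is truncated in the excerpts above.

```latex
% Discrete SGD update with learning rate \eta and minibatch B_k:
\theta_{k+1} = \theta_k - \eta \, \nabla L_{B_k}(\theta_k)
             = \theta_k - \eta \, \nabla L(\theta_k) - \eta \, \xi_k,
\qquad
\xi_k = \nabla L_{B_k}(\theta_k) - \nabla L(\theta_k).

% Langevin-type SDE commonly used to approximate it, with time t = k\eta,
% noise covariance \Sigma(\theta) = \operatorname{Cov}(\xi_k), and Brownian motion W_t:
d\theta_t = -\nabla L(\theta_t) \, dt + \sqrt{\eta} \, \Sigma(\theta_t)^{1/2} \, dW_t .
```

In the discrete update the noise enters each step with a factor of $\eta$, whereas the diffusion coefficient of the SDE carries the factor $\sqrt{\eta}$ that the abstract refers to.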