PulseAugur
Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics

New paper reassesses SGD dynamics and challenges the Brownian motion analogy

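A new paper challenges the common assumption that stochastic gradient descent (SGD) noise behaves like Brownian motion. The researchers propose an alternative model in which SGD dynamics unfold on a loss landscape that fluctuates because of minibatch sampling. The framework reveals distinctive SGD behavior near critical points; in particular, it shows that variance can grow over time along nearly flat directions, which indicates effective diffusion.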
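As a rough illustration of the flat-direction claim in the summary above, the following toy sketch (not the paper's model) runs plain minibatch SGD in one dimension. The per-sample gradient `curvature * theta + e_i` with Gaussian `e_i`, the function names `run_sgd` and `final_variance`, and all parameter values are assumptions made for this example. With zero curvature the iterate is a random walk, so its variance across independent runs should grow with the number of steps; with positive curvature the variance should level off.

```python
import random
import statistics


def run_sgd(curvature, eta=0.1, steps=200, batch=8, noise_std=1.0, seed=0):
    """Run one toy 1-D minibatch SGD trajectory and return the final parameter.

    Assumed per-sample gradient (illustrative only, not from the paper):
        g_i(theta) = curvature * theta + e_i,   e_i ~ N(0, noise_std**2)
    """
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        # Minibatch gradient: average of `batch` noisy per-sample gradients.
        grad = sum(
            curvature * theta + rng.gauss(0.0, noise_std) for _ in range(batch)
        ) / batch
        # Discrete SGD update at finite learning rate eta.
        theta -= eta * grad
    return theta


def final_variance(curvature, runs=400, **kwargs):
    """Variance of the final parameter across independent seeded runs."""
    finals = [run_sgd(curvature, seed=s, **kwargs) for s in range(runs)]
    return statistics.pvariance(finals)


if __name__ == "__main__":
    for steps in (50, 100, 200):
        flat = final_variance(0.0, steps=steps)    # flat direction
        curved = final_variance(1.0, steps=steps)  # curved direction
        print(f"steps={steps:4d}  flat var={flat:.4f}  curved var={curved:.4f}")
```

Under these assumptions the flat-direction variance is close to `steps * eta**2 * noise_std**2 / batch`, so it keeps growing, while the curved-direction variance settles near `eta**2 * noise_std**2 / batch / (1 - (1 - eta * curvature)**2)`.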

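Impact: Challenges a foundational assumption about AI training dynamics, which could lead to more nuanced optimization strategies and a better understanding of model convergence.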

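Ranking rationale: This cluster contains an academic paper that details new theoretical insights and empirical evidence on stochastic gradient descent dynamics.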

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.LG TIER_1 · Igor Ignashin, Anna Radovskaya, Andrew Semenov, Egor Lopatin, Stanislav Potapov, Aleksandr Kovalenko, Andrey Veprikov, Aleksandr Shestakov, Andrey Leonidov, Aleksandr Beznosikov

    Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics

    arXiv:2605.22644v1. Abstract: Stochastic Gradient Descent (SGD) is commonly modeled as a Langevin process, assuming that minibatch noise acts as Brownian motion. However, this approximation relies on a continuous-time limit and a sqrt(eta) noise scaling that doe…

  2. arXiv cs.LG TIER_1 · Aleksandr Beznosikov

    Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics

    Stochastic Gradient Descent (SGD) is commonly modeled as a Langevin process, assuming that minibatch noise acts as Brownian motion. However, this approximation relies on a continuous-time limit and a sqrt(eta) noise scaling that does not match the discrete SGD update at finite le…