PulseAugur
English(EN) Fisher-Geometric Sharpness and the Implicit Bias of SGD toward Flat Minima

New theory grounds flatness in deep learning in Riemannian geometry

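Researchers have developed a new theoretical framework for understanding generalization in deep learning models by grounding the notion of flatness in Riemannian geometry. The approach uses the Fisher information matrix (FIM) to define a reparameterization-invariant sharpness measure, addressing the limitations of conventional Euclidean measures. Experiments on the MNIST and CIFAR-10 datasets show that the new Riemannian sharpness measure accurately tracks generalization performance and agrees with theoretical predictions of SGD's bias toward flatter minima.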
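To make the invariance issue concrete, here is a minimal sketch in plain Python. It is not the paper's measure: the quantity tr(F^{-1} H) is an assumed stand-in for a Fisher-normalized sharpness, and the 2x2 matrices `hessian`, `fisher`, and `jacobian` are made-up illustrative values. It assumes a linear reparameterization theta = J phi, under which both the loss Hessian and the Fisher information matrix transform as M -> J^T M J.

```python
# Illustrative only: tr(F^{-1} H) is an assumed stand-in, not the paper's measure.

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inverse_2x2(a):
    """Closed-form inverse of a 2x2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def reparameterize(matrix, jacobian):
    """Transform a bilinear form under theta = J phi: M -> J^T M J."""
    return matmul(matmul(transpose(jacobian), matrix), jacobian)

def euclidean_sharpness(hessian):
    """Trace of the Hessian: depends on the chosen coordinates."""
    return trace(hessian)

def fisher_normalized_sharpness(hessian, fisher):
    """tr(F^{-1} H): unchanged by an invertible linear reparameterization."""
    return trace(matmul(inverse_2x2(fisher), hessian))

# Hypothetical symmetric positive-definite matrices at a minimum.
hessian = [[2.0, 0.5], [0.5, 1.0]]
fisher = [[1.0, 0.2], [0.2, 3.0]]
# Hypothetical rescaling of the two parameters.
jacobian = [[10.0, 0.0], [0.0, 0.1]]

hessian_new = reparameterize(hessian, jacobian)
fisher_new = reparameterize(fisher, jacobian)

print("Euclidean trace:", euclidean_sharpness(hessian),
      "->", euclidean_sharpness(hessian_new))
print("Fisher-normalized:", fisher_normalized_sharpness(hessian, fisher),
      "->", fisher_normalized_sharpness(hessian_new, fisher_new))
```

The Hessian trace changes by orders of magnitude under the rescaling even though the model is the same, while the Fisher-normalized quantity does not, because the factors of J cancel inside the trace.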

Impact: Provides a more robust theoretical foundation for understanding generalization in deep learning models.

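Ranking rationale: This cluster contains an academic paper that details a new theoretical framework for machine learning along with its experimental validation.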

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →


Sources [2]

  1. arXiv cs.LG TIER_1 English(EN) · Md Sakir Ahmed, Kumaresh Sarmah, Hemen Dutta ·

    Fisher-Geometric Sharpness and the Implicit Bias of SGD toward Flat Minima

    arXiv:2606.20469v1 Announce Type: new Abstract: A widely held intuition in deep learning is that stochastic gradient descent (SGD) implicitly favors flat minima and that flat minima generalize better, but standard Euclidean measures of flatness such as the trace or maximum eigenv…

  2. arXiv cs.LG TIER_1 English(EN) · Hemen Dutta ·

    Fisher-Geometric Sharpness and the Implicit Bias of SGD toward Flat Minima

    A widely held intuition in deep learning is that stochastic gradient descent (SGD) implicitly favors flat minima and that flat minima generalize better, but standard Euclidean measures of flatness such as the trace or maximum eigenvalue of the loss Hessian are not invariant under…