PulseAugur
English(EN) Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories

New Research Reveals Gradient-Direction Sensitivity in AI Model Optimizers

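Researchers have found a new way to analyze how neural networks learn by examining loss gradients rather than optimizer updates. The method, called Gradient-Direction Sensitivity (GDS), reveals a stronger coupling between specific feature directions and linear centroids than had previously been observed. The study finds that GDS raises the measured coupling by one to two orders of magnitude, giving a clearer diagnostic of feature formation in parameter space. In addition, using GDS to restrict attention updates to a rank-3 subspace made the model reach understanding roughly 2.3 times faster.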
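The two operations named here, a rolling SVD over recent loss gradients and a restriction of updates to a rank-3 subspace, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the class name `RollingGradientSVD`, the `window` parameter, and the choice to flatten each gradient into a vector are assumptions made purely for illustration.

```python
import numpy as np
from collections import deque


class RollingGradientSVD:
    """Hypothetical helper: tracks dominant directions of recent loss gradients."""

    def __init__(self, window: int, rank: int = 3):
        self.rank = rank
        # Only the most recent `window` gradients are kept.
        self.buffer = deque(maxlen=window)

    def push(self, grad: np.ndarray) -> None:
        # Assumption: each gradient is flattened into a single row vector.
        self.buffer.append(np.asarray(grad, dtype=float).ravel())

    def basis(self) -> np.ndarray:
        # Rows of `vt` are orthonormal right singular vectors, ordered by
        # decreasing singular value; the first `rank` rows span the subspace.
        stacked = np.stack(list(self.buffer))
        _, _, vt = np.linalg.svd(stacked, full_matrices=False)
        return vt[: self.rank]

    def project(self, update: np.ndarray) -> np.ndarray:
        # Orthogonal projection of an update onto the tracked subspace.
        b = self.basis()
        flat = np.asarray(update, dtype=float).ravel()
        return (b.T @ (b @ flat)).reshape(np.shape(update))


# Usage with synthetic stand-ins for gradients of a 4x5 weight matrix.
rng = np.random.default_rng(0)
tracker = RollingGradientSVD(window=8, rank=3)
for _ in range(8):
    tracker.push(rng.normal(size=(4, 5)))
restricted_update = tracker.project(rng.normal(size=(4, 5)))
```

Feeding the tracker raw loss gradients rather than optimizer updates is the substitution the abstract describes; how the paper actually chooses the window, the layers, and the coupling measure is not stated in the excerpts on this page.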

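Impact: Introduces a new diagnostic for understanding feature formation in neural networks, with the potential to improve training efficiency.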

Ranking rationale: This is an academic paper detailing a new diagnostic method for analyzing neural network training.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 3 sources.


Sources [3]

  1. arXiv cs.LG TIER_1 English(EN) · Yongzhong Xu ·

    Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories

    arXiv:2604.25143v1 Announce Type: new Abstract: We show that replacing the rolling SVD of AdamW updates with a rolling SVD of loss gradients changes the diagnostic by 1-2 orders of magnitude. Performing SVD on the loss gradient instead of the AdamW update increases the measured p…

  2. arXiv cs.LG TIER_1 English(EN) · Yongzhong Xu ·

    Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories

    We show that replacing the rolling SVD of AdamW updates with a rolling SVD of loss gradients changes the diagnostic by 1-2 orders of magnitude. Performing SVD on the loss gradient instead of the AdamW update increases the measured perturbative coupling between SED directions and …

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories

    We show that replacing the rolling SVD of AdamW updates with a rolling SVD of loss gradients changes the diagnostic by 1-2 orders of magnitude. Performing SVD on the loss gradient instead of the AdamW update increases the measured perturbative coupling between SED directions and …