The Implicit Bias of Adam and Muon on Smooth Homogeneous Neural Networks

Adam and Muon optimizers exhibit an implicit bias in neural networks

By the PulseAugur editorial team · [1 source] · 2026-05-26 04:00

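Researchers analyzed the implicit bias of momentum-based optimizers, such as Adam and Muon, when applied to smooth homogeneous neural networks. Their findings indicate that, under a specific learning rate schedule, momentum steepest descent algorithms (including Muon, MomentumGD, and Signum) approximately follow steepest descent trajectories. As a result, these algorithms are biased toward KKT points of the corresponding margin maximization problem, with Adam in particular maximizing the $\ell_\infty$ margin.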
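To make the three norms concrete, here is a minimal sketch of the update direction each norm induces on a momentum buffer. It is an illustration under stated assumptions, not the paper's exact algorithms: the function names, the exponential-moving-average momentum convention, the normalization of the $\ell_2$ direction, and the use of a plain Newton-Schulz iteration for the spectral case are all choices made here for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_norm(m):
    return math.sqrt(sum(x * x for row in m for x in row))

def momentum_update(m, g, beta=0.9):
    """Exponential moving average of gradients (one common convention; assumed here)."""
    return [[beta * mi + (1 - beta) * gi for mi, gi in zip(mr, gr)]
            for mr, gr in zip(m, g)]

def direction_l2(m):
    """l2 norm (MomentumGD-style): momentum scaled to unit Frobenius norm.
    Assumes m is nonzero."""
    norm = frobenius_norm(m)
    return [[x / norm for x in row] for row in m]

def direction_linf(m):
    """l-infinity norm (Signum-style): entrywise sign of the momentum."""
    return [[(x > 0) - (x < 0) for x in row] for row in m]

def direction_spectral(m, steps=30):
    """Spectral norm (Muon-style): approximate the polar factor U V^T of m.
    Uses the basic Newton-Schulz iteration X <- 1.5 X - 0.5 X X^T X, which
    converges for full-rank m after scaling by the Frobenius norm."""
    norm = frobenius_norm(m)
    x = [[v / norm for v in row] for row in m]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(ra, rb)]
             for ra, rb in zip(x, xxtx)]
    return x

def step(params, direction, lr):
    """Move the parameters against the chosen direction."""
    return [[p - lr * d for p, d in zip(pr, dr)]
            for pr, dr in zip(params, direction)]
```

In each case the step moves against a direction of unit size in the chosen norm: unit Frobenius norm for the $\ell_2$ case, entries of magnitude at most one for the $\ell_\infty$ case, and all singular values pushed toward one for the spectral case. The 30-step default is a conservative choice for small, well-conditioned matrices and is not tuned.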

Impact: Provides theoretical insight into optimizer behavior that may guide future model training strategies.

Ranking rationale: This is a research paper published on arXiv that presents a theoretical analysis of optimizers in neural networks along with experimental results.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.LG · Eitan Gronich, Gal Vardi · 2026-05-26 04:00

The Implicit Bias of Adam and Muon on Smooth Homogeneous Neural Networks

arXiv:2602.16340v3 Announce Type: replace Abstract: We study the implicit bias of momentum-based optimizers on smooth homogeneous models. We show that \textit{momentum steepest descent} algorithms like Muon (spectral norm), MomentumGD ($\ell_2$ norm), and Signum ($\ell_\infty$ no…

