Researchers have introduced a mathematical framework for analyzing optimizers such as Muon, which stabilize deep-learning training by applying spectral normalizations to matrix-shaped parameters. The work studies an idealized continuous-time, vanishing-momentum training dynamics in a mean-field regime, representing wide models as probability measures on parameter space. It defines Spectral Wasserstein distances and develops static Kantorovich and dynamic Benamou-Brenier formulations, offering a gradient-flow interpretation of normalized training dynamics.
AI impact: Introduces a mathematical framework for understanding how spectral normalization stabilizes deep-learning optimization, with potential relevance to the training dynamics of wide models.
Ranking rationale: The cluster contains an academic paper detailing a new mathematical framework for deep-learning optimization.
AI-generated summary · Google Gemini · from 1 source.