Aurora: A Leverage-Aware Spectral Optimizer

Aurora optimizer improves MLP training and outperforms Muon

By the PulseAugur editorial team · [2 sources] · 2026-06-26 04:47

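Researchers have introduced Aurora, a new spectral optimizer designed to address non-uniform row norms in the updates to matrix parameters, particularly in MLP layers. Left unchecked, this non-uniformity can leave some neurons with persistently small updates until they become ineffective. Aurora enforces uniform row norms in matrix parameter updates while preserving the desirable geometric properties of the momentum matrix, and it outperforms the existing Muon optimizer in pretraining experiments. It also achieves state-of-the-art results on a modified nanoGPT benchmark and shows promise for training very wide MLP layers.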
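The row-norm problem can be made concrete with a minimal, dependency-free Python sketch. It assumes the idealized Muon update, namely the orthogonal polar factor UV^T of the momentum matrix, and approximates it with the classical cubic Newton-Schulz iteration rather than Muon's tuned variant. The 4x2 matrix, the `eps` value, and all helper names are hypothetical choices for illustration. The sketch shows only the problem; it is not Aurora's algorithm, which this summary does not describe.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def orthogonalize(m, steps=30):
    """Approximate the polar factor U V^T of a full-column-rank matrix m.

    Classical cubic Newton-Schulz iteration X <- 1.5 X - 0.5 X X^T X,
    started from m scaled to unit Frobenius norm (an illustrative choice).
    """
    fro = math.sqrt(sum(v * v for row in m for v in row))
    x = [[v / fro for v in row] for row in m]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * p - 0.5 * q for p, q in zip(rp, rq)] for rp, rq in zip(x, xxtx)]
    return x

def row_norms(m):
    """Euclidean norm of each row."""
    return [math.sqrt(sum(v * v for v in row)) for row in m]

eps = 0.01  # hypothetical scale of the two weak rows
momentum = [  # tall 4x2 momentum matrix: two strong rows, two weak rows
    [1.0, 0.0],
    [0.0, 1.0],
    [eps, 0.0],
    [0.0, eps],
]

update = orthogonalize(momentum)       # idealized Muon-style update
norms = row_norms(update)
ratio = max(norms) / min(norms)        # spread between strongest and weakest row
total = sum(n * n for n in norms)      # squared row norms sum to the column count
print(norms, ratio, total)
```

In this construction the update has orthonormal columns, so its squared row norms sum to the number of columns, yet the ratio between the largest and smallest row norm equals 1/eps. Shrinking `eps` therefore makes the weak rows' share of the update arbitrarily small, which is the kind of imbalance the summary attributes to Muon on tall matrices.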

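Impact: Aurora's improvements could make training wider and deeper neural networks more efficient, potentially accelerating AI research and development.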

Ranking rationale: This cluster covers a new research paper on a novel optimizer for machine learning models.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.LG · Alec Dewulf, Dhruv Pai, Li Yang, Ashley Zhang, Ben Keigwin · 2026-06-29 04:00

Aurora: A Leverage-Aware Spectral Optimizer

arXiv:2606.27715v1. Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive pers…

arXiv cs.LG · Ben Keigwin · 2026-06-26 04:47

Aurora: A Leverage-Aware Spectral Optimizer

We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive persistently small updates and eventually do not con…
