PulseAugur

Aurora optimizer enhances MLP training, outperforming Muon

Researchers have introduced Aurora, a spectral optimizer that targets a failure mode of Muon on tall matrix parameters, such as the projection matrices in MLP layers: the rows of the Muon update can have arbitrarily non-uniform norms. Neurons whose rows receive persistently small updates can fall into a self-reinforcing feedback loop and become ineffective. Aurora enforces row-uniform updates while preserving desirable geometric properties of the momentum matrix, and it outperforms Muon in pre-training experiments. The optimizer also achieved state-of-the-art results on a modified nanoGPT benchmark and shows potential for training very wide MLP layers.
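The row-norm issue can be illustrated with a small numerical sketch. It assumes the commonly described form of the Muon update, an approximation of the orthogonal polar factor U V^T of the momentum matrix, and computes that factor exactly with an SVD rather than with Muon's iterative approximation. The matrix `G`, its shape, and the helper `polar_factor` are hypothetical and chosen only for illustration; this is not Aurora and not the authors' code.

```python
import numpy as np


def polar_factor(G):
    """Return U @ V^T from the thin SVD of G.

    Assumption: this stands in for the orthogonalized update that Muon
    approximates iteratively; it is not Muon's actual implementation.
    """
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt


rng = np.random.default_rng(0)
m, n = 8, 2  # tall matrix: more rows (neurons) than columns

# Hypothetical momentum matrix: two dominant rows, six tiny rows.
G = 0.01 * rng.standard_normal((m, n))
G[0] = [1.0, 0.0]
G[1] = [0.0, 1.0]

O = polar_factor(G)
row_norms = np.linalg.norm(O, axis=1)

# The columns of O are orthonormal, so the squared row norms sum to n.
# A row-uniform update would give every row the norm sqrt(n / m).
uniform_norm = np.sqrt(n / m)

print("row norms of the orthogonalized update:", np.round(row_norms, 4))
print("row norm if uniform:", round(float(uniform_norm), 4))
```

Because the squared row norms always sum to n, the rows share a fixed budget. When a few rows dominate the momentum matrix, they absorb most of that budget and the remaining rows receive updates far below the uniform value, which is the imbalance the summary describes.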

IMPACT If the reported gains hold up, Aurora could make it more efficient to train networks with very wide MLP layers.

RANK_REASON The cluster describes a new research paper detailing a novel optimizer for machine learning models.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Alec Dewulf, Dhruv Pai, Li Yang, Ashley Zhang, Ben Keigwin

    Aurora: A Leverage-Aware Spectral Optimizer

    arXiv:2606.27715v1. Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive pers…

  2. arXiv cs.LG TIER_1 English(EN) · Ben Keigwin

    Aurora: A Leverage-Aware Spectral Optimizer

    We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive persistently small updates and eventually do not con…