PulseAugur

New method exploits weight-space symmetries for deep learning curvature approximation

Researchers have developed a method for approximating the curvature of loss functions in large deep learning models by exploiting weight-space symmetries. The approach analytically averages over group actions that preserve the loss, which makes it possible to construct structured Hessian approximations from single gradients. The choice of symmetry group sets the trade-off between computational cost and accuracy, and the framework unifies existing methods such as Shampoo and Muon. The technique has been validated on several network architectures and applied to second-order optimization benchmarks, including a small language model; potential applications include uncertainty estimation and continual learning.
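
To make the idea of averaging over a group action more concrete, here is a toy sketch. It is not taken from the paper. It only checks a standard identity: if the outer product of a vectorized gradient is averaged over all signed permutations of the rows of a weight matrix, the result is a Kronecker-structured matrix built from G^T G, the kind of factor that appears in Shampoo-style preconditioners. The group, the matrix sizes, and the function names are illustrative assumptions, and whether a given group actually preserves a network's loss depends on the architecture.

```python
import itertools
import numpy as np


def signed_permutations(m):
    """Yield every m x m signed permutation matrix (2**m * m! in total)."""
    for perm in itertools.permutations(range(m)):
        for signs in itertools.product((1.0, -1.0), repeat=m):
            Q = np.zeros((m, m))
            Q[np.arange(m), list(perm)] = signs
            yield Q


def group_averaged_outer_product(G):
    """Exactly average vec(QG) vec(QG)^T over all signed permutations Q.

    Hypothetical helper for illustration; the group is assumed, not
    derived from any particular network.
    """
    m, n = G.shape
    total = np.zeros((m * n, m * n))
    count = 0
    for Q in signed_permutations(m):
        v = (Q @ G).reshape(-1)  # row-major vectorization of the transformed gradient
        total += np.outer(v, v)
        count += 1
    return total / count


rng = np.random.default_rng(0)
G = rng.standard_normal((3, 2))  # stand-in for a single gradient of a 3 x 2 weight matrix

averaged = group_averaged_outer_product(G)

# Closed form of the same average: identity on the row index,
# G^T G / m on the column index.
structured = np.kron(np.eye(3), G.T @ G / 3)

max_error = float(np.abs(averaged - structured).max())
print(max_error)
```

The dense 6 x 6 average never needs to be formed in practice: the closed form stores only the 2 x 2 matrix G^T G. A larger or smaller group would give a different structure, which is the sense in which the choice of group trades cost against accuracy.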

IMPACT This new method could enable more efficient second-order optimization for large deep learning models, potentially speeding up training and improving performance on complex tasks.

RANK_REASON This is a research paper detailing a new theoretical method for approximating curvature in machine learning models.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv stat.ML TIER_1 English(EN) · Artem Artemev, Rui Xia, Benjamin M. Boyd, Youjing Yu, Felix Dangel, Guillaume Hennequin, Alberto Bernacchia

    Exploiting weight-space symmetries for approximating curvature

    arXiv:2606.00442v1 Announce Type: cross Abstract: Many machine learning techniques rely on approximating a loss function's curvature, but this is notoriously hard to do at the scale of modern deep networks. Surprisingly, no previous work has exploited the curvature constraints th…