PulseAugur

New method exploits weight-space symmetries for loss curvature approximation

Researchers have developed a method for approximating the curvature of loss functions in large deep learning models by exploiting weight-space symmetries. The approach analytically averages over group actions that leave the loss unchanged, which makes it possible to build structured Hessian approximations from single gradients. Users control the accuracy-cost trade-off by selecting which symmetry groups to use, and the framework unifies existing methods such as Shampoo and Muon. The technique has been validated on several architectures and applied to second-order optimization benchmarks, including a small language model. Potential applications include uncertainty estimation and continual learning.
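To make the idea of a loss-preserving weight-space symmetry concrete, here is a minimal sketch in plain Python. It is a toy illustration under stated assumptions, not the paper's algorithm: the single ReLU unit, the data `X` and `Y`, and the helper names `loss` and `grad` are all hypothetical. For f(x) = v * relu(w . x), rescaling (w, v) to (a*w, v/a) with a > 0 leaves the loss unchanged. Differentiating that invariance once gives a constraint on the gradient, and differentiating again gives a Hessian-vector product that is fixed by the gradient alone, which is the elementary sense in which a symmetry constrains curvature.

```python
# Toy illustration only (not the method from the paper): one ReLU hidden unit,
# f(x) = v * relu(w . x), whose loss is invariant under (w, v) -> (a*w, v/a), a > 0.

def relu(z):
    return z if z > 0.0 else 0.0

# Hypothetical toy data, chosen so no pre-activation sits at the ReLU kink.
X = [[1.0, 2.0], [-1.0, 0.5], [0.3, -0.7], [2.0, 1.0]]
Y = [1.0, -0.5, 0.2, 0.8]

def loss(w, v):
    """Mean squared error (with a factor 1/2) of the one-unit network."""
    total = 0.0
    for x, y in zip(X, Y):
        pre = sum(wi * xi for wi, xi in zip(w, x))
        total += 0.5 * (v * relu(pre) - y) ** 2
    return total / len(X)

def grad(w, v):
    """Analytic gradient of `loss` with respect to w (list) and v (scalar)."""
    gw = [0.0] * len(w)
    gv = 0.0
    for x, y in zip(X, Y):
        pre = sum(wi * xi for wi, xi in zip(w, x))
        h = relu(pre)
        err = v * h - y
        gv += err * h
        if pre > 0.0:
            for i, xi in enumerate(x):
                gw[i] += err * v * xi
    n = len(X)
    return [g / n for g in gw], gv / n

w, v = [0.8, -0.3], 1.5
scale = 2.5

# 1) The loss is unchanged by the rescaling symmetry.
loss_gap = abs(loss(w, v) - loss([scale * wi for wi in w], v / scale))

# 2) First-order consequence: w . dL/dw - v * dL/dv = 0.
gw, gv = grad(w, v)
grad_residual = sum(wi * gi for wi, gi in zip(w, gw)) - v * gv

# 3) Second-order consequence: H @ (w, -v) = (-dL/dw, +dL/dv).
#    The left side is estimated by a central finite difference of the gradient
#    along the symmetry direction d = (w, -v); the right side needs one gradient.
eps = 1e-5
gwp, gvp = grad([wi * (1 + eps) for wi in w], v * (1 - eps))
gwm, gvm = grad([wi * (1 - eps) for wi in w], v * (1 + eps))
hvp = [(p - m) / (2 * eps) for p, m in zip(gwp + [gvp], gwm + [gvm])]
predicted = [-g for g in gw] + [gv]
hvp_error = max(abs(h - p) for h, p in zip(hvp, predicted))

print("loss gap:", loss_gap)
print("gradient constraint residual:", grad_residual)
print("max HVP mismatch:", hvp_error)
```

All three printed quantities should be close to zero up to floating-point and finite-difference error. The sketch covers only one one-parameter symmetry and one direction of the Hessian; how the paper averages over whole groups to obtain structured approximations is not reproduced here.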

IMPACT This research could lead to more efficient training and better understanding of deep learning models by improving curvature approximation.



AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. arXiv stat.ML TIER_1 English(EN) · Artem Artemev, Rui Xia, Benjamin M. Boyd, Youjing Yu, Felix Dangel, Guillaume Hennequin, Alberto Bernacchia ·

    Exploiting weight-space symmetries for approximating curvature

    arXiv:2606.00442v1 (cross-listed). Abstract: Many machine learning techniques rely on approximating a loss function's curvature, but this is notoriously hard to do at the scale of modern deep networks. Surprisingly, no previous work has exploited the curvature constraints th…

  2. arXiv stat.ML TIER_1 English(EN) · Alberto Bernacchia ·

    Exploiting weight-space symmetries for approximating curvature

    Many machine learning techniques rely on approximating a loss function's curvature, but this is notoriously hard to do at the scale of modern deep networks. Surprisingly, no previous work has exploited the curvature constraints that arise from well known weight-space symmetries i…