PulseAugur

Tensorion optimizer generalizes Muon for tensor-aware machine learning

Researchers have introduced Tensorion, a tensor-aware optimizer that generalizes the Muon optimizer to higher-order tensors. Unlike common first-order optimizers such as Adam, which treat each parameter block as an unstructured vector, Tensorion exploits the multilinear weight structure present in many machine learning models. The optimizer is built around a linear minimization oracle that balances bounding the tensor spectral norm against computational tractability, so that its computation reduces to operations on unfolding matrices. Experiments indicate that Tensorion can improve convergence and yield more stable gradient updates on tensor-based computer vision tasks compared to existing methods.
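To make "operations on unfolding matrices" concrete, here is a minimal NumPy sketch of two standard building blocks: the mode-k unfolding of a tensor, and the linear minimization oracle over the unit spectral-norm ball for a matrix, which is -U V^T from the SVD of the gradient. This is not the Tensorion update, whose details this summary does not give. The helper names `unfold`, `fold`, and `spectral_lmo`, the tensor shape, and the choice of mode are all hypothetical and chosen only for illustration. Muon approximates the same orthogonalization with a Newton-Schulz iteration; an exact SVD is used here for clarity.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: move that axis to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given full shape."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

def spectral_lmo(grad_matrix):
    """LMO over the unit spectral-norm ball.

    argmin over ||X||_2 <= 1 of <G, X> is -U V^T, where G = U S V^T.
    """
    u, _, vt = np.linalg.svd(grad_matrix, full_matrices=False)
    return -u @ vt

# Illustrative only: an arbitrary 3rd-order "gradient" and an arbitrary mode.
rng = np.random.default_rng(0)
grad = rng.standard_normal((4, 3, 5))
mode = 1

# Orthogonalize one unfolding and fold it back into a tensor-shaped direction.
step = fold(spectral_lmo(unfold(grad, mode)), mode, grad.shape)
```

The resulting `step` has the same shape as `grad`, and its mode-1 unfolding has unit spectral norm. How Tensorion combines or selects unfoldings across modes is not specified in this summary, so that choice is left open here.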

IMPACT Introduces a new optimization technique that may improve convergence and stability for tensor-based machine learning models.

RANK_REASON The cluster contains an academic paper detailing a new optimization algorithm for machine learning models.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English(EN) · Maxim Rakhuba

    Tensorion: A Tensor-Aware Generalization of the Muon Optimizer

    Common first-order optimizers, such as Adam, implicitly treat each parameter block as an unstructured vector, which disregards the multilinear weight structure present in many modern machine learning models. Recent work has shown that exploiting matrix structure can improve optim…