PulseAugur
Tensorion: A Tensor-Aware Generalization of the Muon Optimizer

Tensorion optimizer generalizes Muon for tensor-aware machine learning

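Researchers have introduced Tensorion, a novel tensor-aware optimizer that generalizes the Muon optimizer to higher-order tensors. Unlike common optimizers, which treat each parameter block as an unstructured vector, Tensorion exploits the multilinear weight structure inherent in many machine learning models. The optimizer is built around a linear minimization oracle that balances a bound on the tensor spectral norm against computational tractability and reduces to operations on unfolded matrices. Experiments show that, compared with existing methods, Tensorion can converge faster and produce more stable gradient updates on tensor-based computer vision tasks.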
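To make "operations on unfolded matrices" concrete, here is a minimal NumPy sketch. It is not Tensorion's oracle, which the summary does not spell out. It only shows the standard matrix case that Muon-style methods build on: the linear minimization oracle over the unit spectral-norm ball, applied to a single mode unfolding of a gradient tensor. The function names (`unfold`, `fold`, `matrix_lmo`, `unfolded_lmo_step`) and the choice of one fixed mode are assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given full shape."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)


def matrix_lmo(grad_matrix):
    """LMO over the unit spectral-norm ball: argmin <G, X> s.t. ||X||_2 <= 1.

    For G = U S V^T the minimizer is -U V^T (exact SVD used here for clarity;
    Muon approximates this factor with a Newton-Schulz iteration).
    """
    u, _, vt = np.linalg.svd(grad_matrix, full_matrices=False)
    return -u @ vt


def unfolded_lmo_step(grad, mode):
    """Illustrative only: apply the matrix LMO to one unfolding, then fold back."""
    return fold(matrix_lmo(unfold(grad, mode)), mode, grad.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal((3, 4, 5))   # stand-in for a 3rd-order weight gradient
    direction = unfolded_lmo_step(grad, mode=1)
    print(direction.shape)                  # same shape as the gradient tensor
```

The returned direction has the same shape as the gradient, and its mode-1 unfolding has all singular values equal to one, so every singular direction of the unfolded gradient is updated with equal magnitude. How Tensorion combines or chooses unfoldings to bound the tensor spectral norm is described in the paper, not in this sketch.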

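Impact: Introduces a new optimization technique that may improve the convergence speed and stability of tensor-based machine learning models.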

Ranking rationale: This cluster contains an academic paper detailing a new optimization algorithm for machine learning models.


AI-generated summary · Google Gemini · based on 3 sources.


Sources [3]

  1. arXiv cs.LG TIER_1 English(EN) · Vladimir Bogachev, Vladimir Aletov, Alexander Molozhavenko, Sergei Kudriashov, Maxim Rakhuba ·

    Tensorion: A Tensor-Aware Generalization of the Muon Optimizer

    arXiv:2606.25975v1. Abstract: Common first-order optimizers, such as Adam, implicitly treat each parameter block as an unstructured vector, which disregards the multilinear weight structure present in many modern machine learning models. Recent work has shown th…

  2. arXiv cs.LG TIER_1 English(EN) · Maxim Rakhuba ·

    Tensorion: A Tensor-Aware Generalization of the Muon Optimizer

    Common first-order optimizers, such as Adam, implicitly treat each parameter block as an unstructured vector, which disregards the multilinear weight structure present in many modern machine learning models. Recent work has shown that exploiting matrix structure can improve optim…

  3. Hugging Face Daily Papers TIER_1 English(EN) ·

    Tensorion: A Tensor-Aware Generalization of the Muon Optimizer

    Common first-order optimizers, such as Adam, implicitly treat each parameter block as an unstructured vector, which disregards the multilinear weight structure present in many modern machine learning models. Recent work has shown that exploiting matrix structure can improve optim…