Shampoo
PulseAugur coverage of Shampoo — every cluster mentioning Shampoo across labs, papers, and developer communities, ranked by signal.
-
DASH optimizer speeds up Shampoo by up to 5.6x with GPU and root-finding innovations
Researchers have developed DASH, a significantly faster implementation of the Shampoo optimizer for machine learning. DASH utilizes batched block preconditioning to improve GPU utilization and introduces novel methods l…
-
LoRA-Muon: New optimizer boosts deep learning fine-tuning efficiency
Researchers have introduced LoRA-Muon, an optimization technique designed to improve the efficiency and effectiveness of Low-Rank Adaptation (LoRA) for deep learning models. This new method applies spectral steepest-des…
-
New FOAM algorithm enhances Shampoo optimization efficiency
Researchers have introduced FOAM, a new adaptive algorithm designed to improve the efficiency of the Shampoo optimization method. Shampoo is known for its strong performance on large-scale benchmarks but suffers from hi…
-
New method exploits weight-space symmetries for loss curvature approximation
Researchers have developed a novel method for approximating the curvature of loss functions in large deep learning models by exploiting weight-space symmetries. This approach analytically averages over group actions tha…
-
New reparametrization boosts efficiency of Shampoo and SOAP training algorithms
Researchers have developed a new method to reparametrize the Shampoo and SOAP algorithms, improving their efficiency for training neural networks. This technique supports BFloat16 storage, which reduces memory usage, and mi…
-
Muon optimizer fails on convex Lipschitz functions, study finds
A new paper challenges the theoretical underpinnings of the Muon optimization algorithm, demonstrating that it does not converge on convex Lipschitz functions. The research suggests that Muon's practical success likely …
-
Layerwise LQR framework optimizes deep networks using geometry-aware control
Researchers have developed Layerwise LQR (LLQR), a new optimization framework for deep learning models. LLQR reformulates second-order optimization methods, like Newton's method, as a linear quadratic regulator problem.…
-
New theory unifies adaptive optimization methods for nonconvex machine learning
Researchers have developed a unified framework to analyze first-order optimization algorithms used in nonconvex machine learning. This framework encompasses popular methods like AdaGrad, AdaNorm, and variants of Shampoo…