Brief · PulseAugur

TOOL · arXiv cs.AI · 5h

LoRA-Muon: Spectral Steepest Descent on the Low-Rank Manifold

Researchers have introduced LoRA-Muon, an optimizer for Low-Rank Adaptation (LoRA) of deep learning models. It carries spectral steepest-descent update rules over to the low-rank setting, aiming for a more stable and better-performing alternative to existing LoRA tuning methods. According to the authors, LoRA-Muon's learning rates transfer better across model dimensions, and in certain scenarios it can outperform dense baselines while using less memory.
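To make "spectral steepest descent" concrete, here is a minimal NumPy sketch under stated assumptions. It is not the LoRA-Muon update, whose details this brief does not give. It only shows the generic idea: under the spectral norm, the steepest-descent direction for a gradient with thin SVD G = U S Vᵀ is U Vᵀ. Applying it independently to the two LoRA factors is a naive illustration, and the names `spectral_direction`, `lora_spectral_step`, the toy loss, and the learning rate are all hypothetical.

```python
import numpy as np

def spectral_direction(grad):
    """Return U @ Vt from the thin SVD of grad.

    This replaces every nonzero singular value of the gradient with 1,
    which is the steepest-descent direction under the spectral norm.
    Assumes grad has full rank (a zero gradient would give an arbitrary result).
    """
    u, _, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ vt

def lora_spectral_step(A, B, grad_A, grad_B, lr=0.1):
    """Hypothetical step: orthogonalize each factor's gradient on its own.

    This is an illustrative assumption, not the method from the paper.
    """
    new_A = A - lr * spectral_direction(grad_A)
    new_B = B - lr * spectral_direction(grad_B)
    return new_A, new_B

# Toy setup: adapter W = B @ A with rank r, fitted to a random target T.
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2
B = rng.standard_normal((d_out, r))
A = rng.standard_normal((r, d_in))
T = rng.standard_normal((d_out, d_in))

# Gradients of the toy loss 0.5 * ||B @ A - T||_F^2 with respect to each factor.
residual = B @ A - T
grad_B = residual @ A.T   # shape (d_out, r)
grad_A = B.T @ residual   # shape (r, d_in)

new_A, new_B = lora_spectral_step(A, B, grad_A, grad_B)

# Singular values of the orthogonalized direction (all equal to 1 up to rounding).
direction_singular_values = np.linalg.svd(spectral_direction(grad_B), compute_uv=False)
```

The exact SVD is used here for clarity. Muon-style optimizers typically approximate the same orthogonalization with a few Newton-Schulz iterations to avoid the cost of an SVD.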

Muon
AdamW
Shampoo
Low-Rank Adaptation
TinyShakespeare
Spectron
LoRA-RITE
LoRA-Muon