PulseAugur

AdaMeZO optimizer cuts LLM fine-tuning memory needs with Adam-style estimates

Researchers have introduced AdaMeZO, an optimizer designed to make fine-tuning large language models more memory-efficient. Backpropagation-based fine-tuning requires substantial GPU memory; AdaMeZO instead takes a zeroth-order approach that relies only on forward passes. It mimics Adam's moment estimation without keeping the moments in memory, aiming to converge faster than existing memory-saving techniques such as MeZO. Experiments suggest AdaMeZO can achieve better performance with substantially fewer forward passes.
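To make "zeroth-order" concrete, here is a minimal sketch of the two-point, forward-pass-only gradient estimate that MeZO-style methods build on. It is not the AdaMeZO algorithm: the Adam-style moment handling is not described in this summary and is not shown. The toy quadratic `loss`, the helper names (`perturb`, `zo_step`, `run`), and the hyperparameters are all assumptions chosen for illustration.

```python
import random

def loss(theta):
    # Toy quadratic standing in for a model's forward pass (an assumption).
    return sum((t - 3.0) ** 2 for t in theta)

def perturb(theta, seed, scale):
    # Regenerate the same Gaussian direction z from the seed and add
    # scale * z in place, so z itself never has to be stored.
    rng = random.Random(seed)
    for i in range(len(theta)):
        theta[i] += scale * rng.gauss(0.0, 1.0)

def zo_step(theta, seed, lr=0.01, eps=1e-3):
    # Two forward passes around the current point; no backpropagation.
    perturb(theta, seed, +eps)                # theta + eps * z
    loss_plus = loss(theta)
    perturb(theta, seed, -2.0 * eps)          # theta - eps * z
    loss_minus = loss(theta)
    perturb(theta, seed, +eps)                # back to the original theta
    # Scalar estimate of the directional derivative along z.
    g = (loss_plus - loss_minus) / (2.0 * eps)
    perturb(theta, seed, -lr * g)             # theta <- theta - lr * g * z
    return g

def run(steps=2000, dim=5):
    # Hypothetical driver: plain zeroth-order SGD on the toy loss.
    theta = [0.0] * dim
    for step in range(steps):
        zo_step(theta, seed=step)
    return theta

if __name__ == "__main__":
    print(loss(run()))
```

Because the perturbation is regenerated from a seed, the only extra state per step is one scalar and one integer, which is where the memory saving over backpropagation comes from in this kind of scheme.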

Impact: Offers a more memory-efficient fine-tuning method for LLMs, potentially reducing hardware requirements for researchers and developers.

Ranking rationale: The cluster contains an arXiv preprint detailing a new optimization method for LLM fine-tuning.


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.LG TIER_1 English(EN) · Zhijie Cai, Haolong Chen, Guangxu Zhu

    AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

    arXiv:2605.00650v1. Abstract: Fine-tuning LLMs is necessary for various dedicated downstream tasks, but classic backpropagation-based fine-tuning methods require substantial GPU memory. To this end, a recent work, MeZO, which relies solely on forward passes to f…

  2. arXiv cs.AI TIER_1 English(EN) · Guangxu Zhu

    AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

    Fine-tuning LLMs is necessary for various dedicated downstream tasks, but classic backpropagation-based fine-tuning methods require substantial GPU memory. To this end, a recent work, MeZO, which relies solely on forward passes to fine-tune LLMs, significantly reduces GPU require…