On Adaptivity in Zeroth-Order Optimization

New research explains why zeroth-order optimization scales to large language models

By PulseAugur Editorial · [3 sources] · 2026-05-05 15:29

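Two new papers examine zeroth-order (ZO) optimization for fine-tuning large language models (LLMs). The first introduces a kernel perspective and shows that the approximation error depends on the output size rather than the parameter dimension, which gives a theoretical justification for the scalability of ZO methods. The second studies adaptive ZO optimizers and proposes MEAZO, a memory-efficient method that reduces memory overhead while maintaining performance.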

Impact: These theoretical advances could make fine-tuning of large language models more efficient and scalable.

Ranking rationale: Two arXiv papers present new theoretical and algorithmic contributions to zeroth-order optimization for LLM fine-tuning.

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.LG TIER_1 English(EN) · Zhe Li, Bicheng Ying, Zidong Liu, Haibo Yang · 2026-05-06 04:00

Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective

arXiv:2605.03373v1 · Abstract: Classical optimization theory establishes that zeroth-order (ZO) algorithms suffer from a dimension-dependent slowdown, with convergence rates typically scaling with the model dimension compared to first-order methods. However, in c…
arXiv cs.LG TIER_1 English(EN) · Hassan Dbouk, Nidham Gazagnadou, Matthias Reisser, Christos Louizos · 2026-05-06 04:00

On Adaptivity in Zeroth-Order Optimization

arXiv:2605.03869v1 · Abstract: We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no conv…
arXiv cs.LG TIER_1 English(EN) · Christos Louizos · 2026-05-05 15:29

On Adaptivity in Zeroth-Order Optimization

We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no convergence advantage over well-tuned ZO-SGD, while …
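
For readers unfamiliar with the ZO-SGD baseline named in the abstracts above, the following is a minimal sketch of plain ZO-SGD using the textbook two-point random-direction gradient estimate. It is not taken from either paper and does not implement MEAZO or ZO-Adam. The function names (zo_gradient, zo_sgd, quadratic), the toy objective, and all hyperparameters are assumptions chosen purely for illustration.

```python
import random


def zo_gradient(loss, params, eps=1e-3, rng=random):
    """Two-point zeroth-order gradient estimate.

    Perturbs every parameter along one random Gaussian direction z and
    scales z by the central finite-difference slope of the loss along z.
    Only loss evaluations are used; no backpropagation is needed.
    """
    z = [rng.gauss(0.0, 1.0) for _ in params]
    loss_plus = loss([p + eps * zi for p, zi in zip(params, z)])
    loss_minus = loss([p - eps * zi for p, zi in zip(params, z)])
    slope = (loss_plus - loss_minus) / (2.0 * eps)
    return [slope * zi for zi in z]


def zo_sgd(loss, params, lr=0.05, steps=2000, eps=1e-3, seed=0):
    """Plain SGD driven by the zeroth-order estimate (illustrative settings)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    params = list(params)
    for _ in range(steps):
        grad = zo_gradient(loss, params, eps, rng)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic(w):
    """Hypothetical toy objective with its minimum at (1, -2, 3)."""
    target = (1.0, -2.0, 3.0)
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


start = [0.0, 0.0, 0.0]
final = zo_sgd(quadratic, start)
print("initial loss:", quadratic(start))
print("final loss:", quadratic(final))
```

Each step needs two forward evaluations of the loss and no stored activations or gradients, which is the general reason ZO methods are considered for memory-constrained fine-tuning. In an LLM setting the loss would be a forward pass over a mini-batch rather than a closed-form function.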