Two new papers explore zeroth-order (ZO) optimization for fine-tuning large language models (LLMs). The first introduces a kernel perspective, showing that the approximation error depends on output size rather than parameter dimension, which theoretically justifies the scalability of ZO methods. The second investigates adaptive ZO optimizers and proposes MEAZO, a memory-efficient method that matches performance with reduced memory overhead.
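For readers unfamiliar with the term, the core idea of zeroth-order optimization can be sketched in a few lines: the gradient is estimated from loss values alone, with no backpropagation. The example below is a generic two-point (SPSA-style) estimator on a toy quadratic, not the method of either paper and not MEAZO. The names `zo_gradient`, `zo_sgd`, and `quadratic`, as well as the step size and perturbation scale, are illustrative assumptions.

```python
import random


def zo_gradient(loss, params, eps=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate (hypothetical helper).

    Uses only two loss evaluations along one random direction z:
        g ~= (loss(w + eps*z) - loss(w - eps*z)) / (2*eps) * z
    """
    rng = random.Random(seed)
    # Random Gaussian perturbation direction, one entry per parameter.
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Scalar estimate of the directional derivative along z.
    scale = (loss(plus) - loss(minus)) / (2.0 * eps)
    return [scale * zi for zi in z]


def zo_sgd(loss, params, lr=0.05, steps=500, eps=1e-3):
    """Plain SGD driven by the zeroth-order estimate (assumed settings)."""
    for step in range(steps):
        # A fresh seed per step gives a fresh random direction.
        grad = zo_gradient(loss, params, eps=eps, seed=step)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic(w):
    """Toy loss with its minimum at w = (1, 1, ..., 1)."""
    return sum((wi - 1.0) ** 2 for wi in w)


final = zo_sgd(quadratic, [0.0, 0.0, 0.0])
print(final, quadratic(final))
```

Because each step needs only forward evaluations of the loss, no activations or gradients have to be stored, which is the property that makes ZO methods attractive for memory-constrained fine-tuning.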
Impact: These theoretical advancements could enable more efficient and scalable fine-tuning of large language models.
Ranking rationale: Two arXiv papers present novel theoretical and algorithmic contributions to zeroth-order optimization for LLM fine-tuning.
AI-generated summary · Google Gemini · from 3 sources.