PulseAugur

Anon optimizer offers tunable adaptivity and outperforms Adam and SGD on key tasks

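Researchers have introduced a new optimizer called Anon, designed to close the performance gap between adaptive methods such as Adam and non-adaptive methods such as SGD. Anon's adaptivity is continuously tunable: it can interpolate along the adaptivity spectrum and even extrapolate beyond it, reaching behavior that existing optimizers do not cover. The optimizer uses an incremental delay update mechanism to ensure convergence across its entire adaptivity spectrum, and it showed superior performance on image classification, diffusion, and language modeling tasks.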
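The summary does not give Anon's update rule, so the following is only a minimal sketch of what "continuously tunable adaptivity" can look like in general, not Anon's algorithm. It assumes a hypothetical exponent `gamma` applied to an Adam-style second-moment denominator, in the spirit of partially adaptive methods: `gamma = 0` reduces to a momentum-SGD-like step, `gamma = 1` to an Adam-like step, and values outside that range extrapolate. The function names, the `gamma` parameter, and the defaults are illustrative assumptions, and the incremental delay update mechanism is not modeled here.

```python
import math


def init_state(n):
    """Create optimizer state for n scalar parameters (hypothetical helper)."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def tunable_adaptive_step(params, grads, state, lr=0.01, gamma=1.0,
                          beta1=0.9, beta2=0.999, eps=1e-8):
    """One update step with an assumed adaptivity exponent `gamma`.

    gamma = 0  -> no per-coordinate scaling (momentum-SGD-like)
    gamma = 1  -> divide by sqrt of the second moment (Adam-like)
    other values interpolate between, or extrapolate beyond, the two.
    """
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction, as in Adam.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # The exponent controls how strongly the step is normalized.
        scale = (math.sqrt(v_hat) + eps) ** gamma
        new_params.append(p - lr * m_hat / scale)
    return new_params


# Usage: the same gradient produces different steps as gamma varies.
for gamma in (0.0, 0.5, 1.0, 2.0):
    state = init_state(1)
    print(gamma, tunable_adaptive_step([1.0], [2.0], state, lr=0.1, gamma=gamma))
```

With `gamma` as a plain float, a single code path covers the SGD-like and Adam-like endpoints and everything in between, which is the property the summary attributes to Anon's adaptivity spectrum.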

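Impact: Introduces a new optimizer that could improve the training efficiency and performance of large models on image, diffusion, and language tasks.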

Ranking rationale: Academic paper presenting a new optimizer with theoretical guarantees and empirical results.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.LG TIER_1 English(EN) · Yiheng Zhang, Kaiyan Zhao, Shaowu Wu, Yiming Wang, Jiajun Wu, Leong Hou U, Steve Drew, Xiaoguang Niu ·

    Anon: Extrapolating Optimizer Adaptivity Across the Real Spectrum

    arXiv:2605.02317v1. Abstract: Adaptive optimizers such as Adam have achieved great success in training large-scale models like large language models and diffusion models. However, they often generalize worse than non-adaptive methods, such as SGD on classical …

  2. arXiv cs.LG TIER_1 English(EN) · Xiaoguang Niu ·

    Anon: Extrapolating Optimizer Adaptivity Across the Real Spectrum

    Adaptive optimizers such as Adam have achieved great success in training large-scale models like large language models and diffusion models. However, they often generalize worse than non-adaptive methods, such as SGD on classical architectures like CNNs. We identify a key cause o…