Adam
PulseAugur coverage of Adam — every cluster mentioning Adam across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 9 days
-
FG^2-GDN enhances long-context understanding with adaptive learning rates
Researchers have introduced FG$^2$-GDN, a novel approach to enhance long-context understanding in neural networks. This method improves upon existing Gated Delta Networks by replacing a scalar learning rate with a chann…
-
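New AdamO optimizer improves stability and performance in offline reinforcement learning
Researchers have introduced AdamO, a new optimizer designed to improve the stability of offline reinforcement learning. The optimizer addresses the "collapse" problem, in which errors in temporal-difference (TD) updates can drive Q-values to extreme, unusable levels. AdamO introduces an orthogonality constraint to prevent the amplification of TD errors, theoretically guaranteeing task safety while preserving Adam's continuous-time dissipative dynamics. Empirical results show that, when integrated with existing baselines, AdamO improves stability and performance across a range of offline reinforcement learning benchmarks.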
-
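Anon optimizer offers tunable adaptivity, outperforming Adam and SGD on key tasks
Researchers have introduced Anon, a new optimizer designed to close the performance gap between adaptive methods such as Adam and non-adaptive methods such as SGD. Anon offers continuously tunable adaptivity: it can interpolate and even extrapolate across its adaptivity spectrum, going beyond the behavior of existing optimizers. The optimizer uses an incremental delayed update mechanism to ensure convergence across its entire adaptivity spectrum, and it demonstrated superior performance on image classification, diffusion, and language modeling tasks.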
-
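New theory explores how pretraining and sparse connectivity improve deep learning generalization
Three new papers explore the theoretical foundations of generalization in deep learning. One identifies pretraining as the key factor in weak-to-strong generalization and shows that the capability emerges through a phase transition during pretraining. Another studies how sparse connectivity in convolutional networks improves generalization by processing inputs in low-dimensional patches, offering a principled explanation for their advantage. The third proposes a non-asymptotic theory that explains generalization by showing how the neural tangent kernel partitions the output space and manages signal and noise, and it introduces a practical objective that improves training efficiency and performance.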
-
AI advances in CAD automation and chatbot regulation move forward in 2026
The U.S. Senate Judiciary Committee has advanced the GUARD Act, which would require identity verification for users of AI chatbots. This bipartisan measure aims to protect minors from unregulated AI interactions. Separa…
-
AdamFusion launches AI copilot for Autodesk Fusion 360 CAD
Adam, an AI copilot for Autodesk Fusion 360, has been released, enabling users to control CAD operations through native agents. The tool integrates as an add-in for Fusion 360, with installation instructions provided fo…
-
AdaMeZO optimizer cuts LLM fine-tuning memory needs with Adam-style estimates
Researchers have introduced AdaMeZO, a novel optimizer designed to make fine-tuning large language models more memory-efficient. Unlike traditional methods that require significant GPU memory for backpropagation, AdaMeZ…
-
Mindstream founders detail their journey building an AI newsletter
Mindstream, an AI newsletter founded by Adam and Matt, is sharing its origin story. The co-founders left their jobs two years ago to pursue the venture full-time, aiming to simplify the complex AI landscape for readers.…
-
New research shows immediate derivatives suffice for online recurrent adaptation
Researchers have developed a new method for online recurrent adaptation that significantly reduces computational requirements. Their approach, termed 'Immediate Derivatives Suffice,' eliminates the need for propagating …
-
Researchers analyze Adam's tradeoffs and enhance SignSGD with hybrid switching strategy
Two new research papers explore advancements in optimization algorithms for machine learning. One paper provides a theoretical analysis of the Adam optimizer, detailing its performance under non-stationary objectives an…
-
Researchers discover hidden failure modes in Adam optimizer for continual learning
Researchers have identified a hidden failure mode when gradient modification techniques are combined with the Adam optimizer in continual learning scenarios. This issue, particularly prevalent with shared-routing projec…
-
Google AI unveils Nested Learning; OpenAI advances meta-learning and AI safety
Google Research has introduced "Nested Learning," a novel machine learning paradigm designed to address the challenge of catastrophic forgetting in continual learning. This approach views models as interconnected optimi…
-
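Google AI launches research agent; OpenAI details network training and nonlinear computation
Google AI has introduced the Test-Time Diffusion Deep Researcher (TTD-DR), a novel framework that mimics the human research process by iteratively drafting and revising reports using retrieved information. The approach models report writing as a diffusion process, refining an initial draft through a search-driven denoising mechanism. OpenAI has also published several papers detailing techniques for training large neural networks, including data, pipeline, and tensor parallelism, and exploring the nonlinear computational properties of deep linear networks that arise from floating-point arithmetic. In addition, OpenAI discussed infrastructure considerations for deep learning and a technique called weight normali…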