PulseAugur

AdamW

PulseAugur coverage of AdamW — every cluster mentioning AdamW across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 27 (27 in the last 90 days)
Releases · 30 days: 0 (0 in the last 90 days)
Papers · 30 days: 27 (27 in the last 90 days)
Tier distribution · 90 days
Relations
Sentiment · 30 days: 10 days with sentiment data

Recent · page 2 of 2 · 27 items total
  1. TOOL · CL_20375

    New MetaAdamW optimizer uses self-attention for adaptive learning rates

    Researchers have developed MetaAdamW, a novel optimizer that enhances adaptive learning rates and weight decay by employing a self-attention mechanism. This Transformer-based approach dynamically adjusts hyperparameters…

  2. TOOL · CL_18808

    New FIBER optimizer enhances differential privacy for AI training

    Researchers have introduced FIBER, a novel differentially private optimizer designed to enhance the performance of models trained with temporally filtered gradients. FIBER addresses the issue of miscalibrated bias corre…

  3. TOOL · CL_26988

    New FiBeR optimizer boosts private AI model training

    Researchers have developed FiBeR, a new differentially private optimizer designed to improve training performance for models that use temporal filtering on their gradients. This method addresses issues where standard DP…

  4. RESEARCH · CL_14472

    Convergence Rate Analysis of the AdamW-Style Shampoo: Unifying One-sided and Two-Sided Preconditioning

    A new theory, the Norm-Separation Delay Law, explains the phenomenon of grokking, where models generalize long after memorizing training data. Researchers demonstrated that grokking is a representational phase transitio…

  5. RESEARCH · CL_10117

    AdaFRUGAL paper introduces dynamic controls for memory-efficient LLM training

    Researchers have developed AdaFRUGAL, a new framework designed to make training Large Language Models (LLMs) more memory-efficient. Unlike previous methods that required manual tuning of hyperparameters, AdaFRUGAL autom…

  6. RESEARCH · CL_08353

    New research reveals gradient-direction sensitivity in optimizers for AI models

    Researchers have identified a new method for analyzing how neural networks learn by examining loss gradients instead of optimizer updates. This approach, termed Gradient-Direction Sensitivity (GDS), reveals a stronger c…

  7. RESEARCH · CL_03546

    New Rose optimizer offers low VRAM, fast convergence, and great results

    A new PyTorch optimizer named Rose has been released under the Apache 2.0 license. Developed by Matthew K., Rose is designed to be stateless, offering significantly lower VRAM usage compared to optimizers like AdamW, wi…