PulseAugur

AdamW

PulseAugur coverage of AdamW — every cluster mentioning AdamW across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 27 (27 within 90 days)
Releases · 30 days: 0 (0 within 90 days)
Papers · 30 days: 27 (27 within 90 days)
Tier distribution · 90 days
Relations
Sentiment · 30 days

10 days with sentiment data

Recent · Page 1 of 2 · 27 items in total
  1. RESEARCH · CL_48587 ·

    New optimizer SF-NorMuon matches AdamW performance without schedules

    Researchers have developed SF-NorMuon, a new schedule-free spectral optimizer that matches or surpasses the performance of traditional AdamW optimizers. This advancement addresses a key limitation in current anytime tra…

  2. RESEARCH · CL_42129 ·

    New research enables efficient hyperparameter transfer for large neural networks

    Researchers have developed new methods for hyperparameter transfer, enabling more efficient scaling of large neural networks. One paper introduces a parameterization justified by dynamical mean-field theory, allowing re…

  3. TOOL · CL_41851 ·

    New HORST optimizer enhances sparse transformer training

    Researchers have developed HORST, a novel optimizer designed to improve the training of sparse transformers. Standard optimizers struggle to balance the need for sparsity with training stability. HORST addresses this by…

  4. RESEARCH · CL_44881 ·

    Optimizer choice dramatically alters Transformer scaling laws, research finds

    A new research paper demonstrates that the choice of optimizer significantly impacts a Transformer model's capacity and scaling laws, even when the architecture remains identical. The study found that the Muon optimizer…

  5. TOOL · CL_40880 ·

    LionMuon optimizer cuts training cost for large models

    Researchers have introduced LionMuon, a novel optimization algorithm designed for efficient training of large-scale models. This method alternates between the low-cost updates of Lion and the stronger, albeit more expen…

  6. RESEARCH · CL_39993 ·

    New optimizers AMUSE, MiMuon, and Pion enhance deep learning training

    Researchers have developed several new optimization techniques to improve deep learning model training. AMUSE combines the rapid adaptation of Muon with the stability of Schedule-Free averaging, eliminating the need for…

  7. RESEARCH · CL_38176 ·

    Ringmaster LMO method improves asynchronous neural network training

    Researchers have developed Ringmaster LMO, a novel asynchronous method for training neural networks that addresses inefficiencies in distributed systems. This approach builds upon the delay-thresholding concept to manag…

  8. RESEARCH · CL_38177 ·

    New optimizers respect neural network symmetries, improve training

    Researchers have introduced a new principle for designing optimizers in deep learning that aligns with the inherent symmetries of neural network architectures. Unlike current optimizers like Adam, which operate on param…

  9. TOOL · CL_36032 ·

    New optimizer ML-FOP-SOAP enhances multimodal AI training stability

    Researchers have developed a new second-order optimization framework called ML-FOP-SOAP to address modality competition in multimodal AI models. This method aims to stabilize training and improve large-batch scaling by …

  10. RESEARCH · CL_36602 ·

    New research explores advanced optimization for machine learning

    Several recent research papers explore advanced optimization techniques for machine learning. One paper introduces a derivative-free consensus-based method for nonconvex bi-level optimization, demonstrating convergence …

  11. RESEARCH · CL_32651 ·

    New DBS-Adam optimizer improves deep learning for imbalanced data

    Researchers have developed a new optimization algorithm called Dynamic Batch-Sensitive Adam (DBS-Adam) designed to improve the training of deep learning models, particularly those dealing with imbalanced and sequential …

  12. RESEARCH · CL_28033 ·

    Tilde Research launches Aurora optimizer to fix neuron death in Muon

    Tilde Research has introduced Aurora, a novel optimizer designed to train neural networks more effectively. Aurora addresses a critical issue in the popular Muon optimizer where a significant number of neurons become pe…

  13. RESEARCH · CL_29333 ·

    Paper details uniform scaling limits in AdamW-trained transformers

    Researchers have published a paper detailing uniform scaling limits in transformers trained with the AdamW optimizer. The study models hidden-state dynamics as an interacting particle system, demonstrating convergence t…

  14. RESEARCH · CL_28256 ·

    Muown optimizer improves LLM training by controlling row-norm drift

    Researchers have developed Muown, a novel optimization method designed to improve the training of large language models. Muown addresses issues with the Muon optimizer, specifically the upward drift of spectral norms in…

  15. TOOL · CL_27538 ·

    New research links optimizers to mode connectivity in neural networks

    Researchers have explored the role of optimizers in mode connectivity within neural networks, a concept previously underexplored. Their work demonstrates that solutions generated by a single optimizer, such as AdamW or …

  16. TOOL · CL_25579 ·

    OrScale optimization method improves neural network training

    Researchers have introduced OrScale, a novel optimization technique designed to enhance neural network training. OrScale builds upon the Muon method by incorporating layer-wise trust-ratio scaling, which measures the Fr…

  17. TOOL · CL_22088 ·

    New principle optimizes AI model training by aligning gradients and updates

    Researchers have introduced a new principle called Greedy Alignment for selecting and tuning optimizer hyperparameters in machine learning. This principle treats optimizers as causal filters that map gradients to update…

  18. RESEARCH · CL_22113 ·

    New research links optimizer choice to reduced forgetting in LLM finetuning

    Researchers have explored the impact of optimizer consistency during the fine-tuning of large language models. One study suggests that using the same optimizer for both pre-training and fine-tuning leads to less knowled…

  19. RESEARCH · CL_22009 ·

    GONO optimizer adapts Adam's momentum using directional consistency for better convergence

    Researchers have introduced the GONO framework, an optimization signal designed to improve deep learning training by addressing the decoupling of directional alignment and loss convergence. Unlike existing optimizers th…

  20. TOOL · CL_21042 ·

    Meta AI launches NeuralBench to standardize brain signal AI model evaluation

    Meta AI has introduced NeuralBench, an open-source framework designed to standardize the evaluation of AI models that analyze brain signals. The initial release, NeuralBench-EEG v1.0, is the most extensive benchmark of …