ENTITY SGD

PulseAugur coverage of SGD — every cluster mentioning SGD across labs, papers, and developer communities, ranked by signal.

Total · 30d: 59 (59 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 58 (58 over 90d)

SENTIMENT · 30D

16 days with sentiment data

RECENT · PAGE 1/3 · 59 TOTAL

TOOL · CL_109993 · Jun 25 · 04:00

New SCENT algorithm improves optimization for entropic risk minimization

Researchers have developed a new algorithm called SCENT for compositional entropic risk minimization, a problem formulation involving Log-Expectation-Exponential functions. Existing methods for this type of optimization…

TOOL · CL_109883 · Jun 25 · 04:00

New framework reveals SGD limitations for multi-index models

A new framework has been developed to analyze the limitations of standard stochastic gradient descent (SGD) for multi-index models, which are functions dependent on low-dimensional projections of input data. This resear…

TOOL · CL_108098 · Jun 24 · 04:00

New decentralized AI training method finds flatter minima, beats centralized SGD

Researchers have developed a new decentralized training method called DSGD-AC that challenges the notion that decentralized learning is inherently inferior to centralized approaches. This method uses an adaptive consens…

TOOL · CL_105062 · Jun 22 · 06:57

New GRAIN algorithm tackles learning instability in large AI models

Researchers have introduced GRAIN, a novel training algorithm designed to address learning instability in large, overparameterized deep learning models. GRAIN replaces the standard mean aggregation of gradients with a m…

TOOL · CL_102520 · Jun 21 · 10:30

AI agent compares cross-border prices with 73% click-through rate · 2 sources tracked

This build log details the creation of a cross-border price comparison agent using BuyWhere MCP and OpenAI's Agents SDK. The agent aims to find the cheapest product offers across different regions and currencies, consid…

RESEARCH · CL_99964 · Jun 18 · 16:48

New theory grounds deep learning flatness in Riemannian geometry

Researchers have developed a new theoretical framework for understanding the generalization capabilities of deep learning models by grounding the concept of flatness in Riemannian geometry. This approach utilizes the Fi…

TOOL · CL_98196 · Jun 18 · 04:00

New research explores nonlinear dynamics stability in GD and SGD

Researchers have investigated the stability of nonlinear dynamics in gradient descent (GD) and stochastic gradient descent (SGD) optimization algorithms, moving beyond simplified quadratic potential assumptions. The stu…

RESEARCH · CL_97792 · Jun 17 · 15:19

Research paper analyzes compute efficiency and runtime tradeoffs for momentum methods

A new research paper explores the tradeoffs between serial runtime and compute efficiency for stochastic momentum methods like Heavy Ball (HB) and Accelerated SGD (ASGD). The study proves finite-dimensional lower bounds…

TOOL · CL_96921 · Jun 17 · 13:58

Machine Learning in Healthcare Course Syllabus Detailed

This document outlines a comprehensive curriculum for a Machine Learning in Healthcare course. It covers fundamental concepts like the distinction between machine learning and deep learning, various neural network archi…

TOOL · CL_96217 · Jun 17 · 04:00

New theory explains grokking in deep neural networks via L2 phase transitions

Researchers have developed a new theory explaining the phenomenon of "grokking" in deep neural networks, where a model abruptly begins to generalize after a period of overfitting. The study, published on arXiv, proposes…

RESEARCH · CL_97809 · Jun 16 · 20:14

Mixed-Precision CA-SGD Accelerates Training on GPUs

Researchers have developed a mixed-precision communication-avoiding SGD (CA-SGD) method for generalized linear models on GPUs. This approach aims to reduce communication bottlenecks in distributed training by amortizing…

TOOL · CL_93749 · Jun 16 · 04:00

New Schattor optimization methods unify SGD and Muon for deep learning

Researchers have introduced Schattor, a new family of adaptive optimization methods for deep learning that utilize Schatten norms. This framework unifies existing methods like SGD and Muon, addressing challenges posed b…

RESEARCH · CL_93364 · Jun 16 · 04:00

New research explores advanced sampling techniques for machine learning

Two new research papers explore advanced techniques for sampling from complex probability distributions, a critical task in machine learning. The first paper, submitted to arXiv, focuses on variance reduction methods li…

RESEARCH · CL_95803 · Jun 15 · 23:43

New Theory: SA-Adam Adaptivity Asymptotically Invisible

Researchers have published a paper detailing a theoretical analysis of adaptive optimization algorithms, specifically focusing on SA-Adam with momentum and non-convergent adaptive preconditioning. The study proves a non…

RESEARCH · CL_93674 · Jun 15 · 07:06

New research explores domain generalization methods, including simple baselines and novel optimizers

Researchers are exploring new methods for domain generalization (DG) and open domain generalization (ODG) in machine learning. One study demonstrates that simple DG methods like CORAL and MMD can be competitive with mor…

TOOL · CL_91486 · Jun 15 · 04:00

New framework tackles data heterogeneity in hierarchical federated learning

Researchers have developed a new framework for hierarchical federated learning that addresses the issue of data heterogeneity across different clusters. The proposed DC-HierSignSGD algorithm uses binary sign-based stoch…

RESEARCH · CL_90920 · Jun 12 · 08:43

Adam vs. SGD: No single factor explains performance gap, study finds

A new research paper explores the performance gap between the Adam and SGD optimization algorithms, finding that no single factor consistently explains the difference. The study indicates that the gap arises from comple…

RESEARCH · CL_91199 · Jun 11 · 00:00

On-Policy Distillation Updates Found to Be Sparse and Geometrically Distinct

A new research paper explores the mechanics of on-policy distillation (OPD), a post-training technique that combines on-policy student trajectories with dense teacher supervision. The study reveals that OPD updates are …

TOOL · CL_77356 · Jun 8 · 04:00

New research analyzes GD/SGD stability in discrete parameter spaces

Researchers have analyzed the generalization error and stability of gradient descent (GD) and stochastic gradient descent (SGD) algorithms when applied to discrete parameter spaces with rounding. Their findings indicate…

TOOL · CL_74964 · Jun 6 · 13:01

Karpathy revisits 1989 neural net, cuts errors with modern AI techniques

Andrej Karpathy recreated a 1989 neural network, achieving a 60% error reduction by applying modern deep learning techniques. He demonstrated that innovations like using cross-entropy loss instead of mean squared error,…