gradient descent
PulseAugur coverage of gradient descent — every cluster mentioning gradient descent across labs, papers, and developer communities, ranked by signal.
-
New framework analyzes gradient descent convergence in neural networks
Researchers have developed a new framework to analyze the convergence of gradient descent in neural networks, extending beyond the traditional neural tangent kernel (NTK) regime. This framework applies to a broad range …
-
Mathematician dismisses AI breakthroughs as predictable math
A mathematician argues that modern AI is not mathematically groundbreaking, but rather a predictable outcome of increased computing power, vast amounts of data, and financial incentives. The author criticizes the public…
-
New ML evaluation metric prioritizes computational effort over accuracy
A new research paper proposes a paradigm shift in evaluating machine learning models, moving beyond maximum accuracy to consider computational effort. The proposed metric, based on the number of gradient descent steps r…
-
New analysis details gradient descent performance in logistic regression
This paper analyzes the finite-sample performance of gradient descent in logistic regression with Gaussian design. The authors establish that gradient descent can achieve linear convergence to a small neighborhood of th…
-
New score matching method promises global convergence for generative models
Researchers have developed a new approach to score matching in generative modeling by utilizing reverse Fisher divergence instead of the standard forward Fisher divergence. This alternative objective demonstrates improv…
-
ReLU activation's impact on gradient descent bias in neural networks detailed
A new research paper explores how the ReLU activation function influences the implicit bias of gradient descent in high-dimensional neural network regression. The study, using a novel primal-dual analysis, demonstrates …
-
New research explores stability of nonlinear dynamics in GD and SGD
Researchers have investigated the stability of nonlinear dynamics in gradient descent (GD) and stochastic gradient descent (SGD) optimization algorithms, moving beyond simplified quadratic potential assumptions. The stu…
-
New research tightens bounds on gradient descent for logistic regression · 2 sources tracked
Two new arXiv papers examine the theory of gradient descent for logistic regression. The first paper focuses on low-dimensional, separable data, providing tighter bounds on the convergence rate by …
-
New research frameworks model gradient descent at the edge of stability
Two new research papers explore the phenomenon of gradient descent operating at the edge of stability (EoS) in deep learning. The first paper introduces 'Edge Flow,' a system of differential equations that models gradie…
-
New neural network architectures tackle complex scientific computing problems · 8 sources tracked
Researchers are developing novel neural network architectures to solve complex partial differential equations (PDEs) and model dynamical systems. These include structure-oriented randomized neural networks (SO-RaNN) for…
-
New EM-NeSy approach enhances neurosymbolic AI learning
Researchers have introduced EM-NeSy, a novel approach to neurosymbolic learning that frames the process as an instance of the Expectation-Maximization (EM) algorithm. This method allows for approximate inference without…
-
Gradient descent outperforms ridge regression in linear models
A new research paper published on arXiv analyzes the performance of gradient descent (GD) compared to ridge regression and online stochastic gradient descent (SGD) in linear regression tasks. The study finds that GD con…
-
New framework reveals hierarchy in neural network training dynamics
Researchers have developed a new framework for understanding the training dynamics of feed-forward ReLU neural networks. Their work rewrites gradient descent not as a weight-space dynamic, but as a collective dynamic on…
-
Gradient descent with large step size redistributes signals in deep networks
Researchers have demonstrated that discrete gradient descent with a large step size leads to a different outcome than gradient flow in deep linear networks with multiple pathways. While gradient flow predicts a "winner-…
-
Deep neural networks achieve optimal generalization rates
Two new papers submitted to arXiv analyze the generalization performance of gradient descent methods in deep neural networks. The research establishes minimax-optimal rates for excess population risk in deep ReLU networ…
-
New model combines differential evolution and gradient descent for data representation
Researchers have developed a new Ensembled Latent Factor Model (ELFM-DEGDO) designed to better represent high-dimensional and incomplete data. This model uniquely combines differential evolution and gradient descent opt…
-
Deep ReLU networks achieve optimal generalization rates with gradient descent
Researchers have established optimal generalization rates for gradient descent in deep ReLU networks, a significant step beyond previous findings. The new work achieves rates comparable to the minimax optimal rates seen…
-
Spectral collapse hinders deep learning plasticity, researchers find
Researchers have identified spectral collapse as a key reason why deep neural networks lose plasticity when learning new tasks. This phenomenon occurs when the Hessian matrix loses effective curvature, rendering gradien…
-
Looped transformers with layer norm provably learn the power method
Researchers have theoretically demonstrated how looped transformers with layer normalization can learn the power method for principal component prediction. The study proves that such models, when trained with gradient d…
-
Researchers explore grokking phenomenon in ridge regression
Three new research papers explore the concept of "grokking" in machine learning, specifically within the context of ridge regression. One paper presents a numerical procedure to find optimal regularization strength, dem…