Gradient Flow
PulseAugur coverage of Gradient Flow — every cluster mentioning Gradient Flow across labs, papers, and developer communities, ranked by signal.
-
Gradient Descent with Large Step Size Redistributes Signals in Deep Networks
Researchers have demonstrated that discrete Gradient Descent with a large step size leads to a different outcome than Gradient Flow in deep linear networks with multiple pathways. While Gradient Flow predicts a "winner-…
-
New theory generalizes regularization for wide neural networks
A new paper introduces a framework for understanding and generalizing regularization in wide neural networks. The research finds that standard ridge regularization can distort the inductive bias of feature-le…
-
Beyond Linearity in Attention Projections: The Case for Nonlinear Queries
Researchers are exploring the fundamental mechanisms behind transformer attention, with new papers analyzing its gradient flow structure and dynamics. One study interprets attention as a gradient flow on a unit sphere, …