Researchers explore grokking phenomenon in ridge regression

By PulseAugur Editorial · [4 sources] · 2026-05-27 16:12

Three new research papers examine ridge regression from different angles, one of them through the lens of "grokking," the onset of generalization long after overfitting. The first presents an iterative numerical procedure for computing the optimal regularization strength and reports near-optimal generalization. The second proves grokking results for over-parameterized linear models trained with gradient descent and weight decay, suggesting that grokking depends on training conditions rather than reflecting a fundamental flaw. The third connects stochastic resetting from non-equilibrium statistical physics to ridge regularization: for linear gradient flow, resetting to the origin at rate $r$ yields a stationary mean equal to the ridge estimator $(X^\top X+rI)^{-1}X^\top y$, and the paper explores alternative spectral filters that arise from different renewal laws.

IMPACT These papers offer theoretical insights into generalization and training dynamics, potentially informing the development of more robust machine learning models.

RANK_REASON The cluster contains multiple academic papers on a theoretical machine learning topic.

AI-generated summary · Google Gemini · from 4 sources.

COVERAGE [4]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-27 16:12

Optimal ridge regularization revisited

We consider $L^2$-regularized linear (ridge) regression over a finite data sample $X$ with bounded covariance and linear prediction targets $y$ with additive isotropic noise of finite variance. We present an iterative procedure to compute the optimal regularization strength numer…
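
The abstract is cut off before it describes the iterative procedure, so the following is not the paper's method. It is a minimal sketch of the underlying problem, assuming synthetic Gaussian data and a plain grid search over held-out error; the helper names `ridge_fit` and `pick_lambda`, the sample sizes, and the grid are all illustrative choices.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimator (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def pick_lambda(X_tr, y_tr, X_val, y_val, grid):
    """Return the grid value with the lowest held-out mean squared error."""
    errors = np.array([
        np.mean((X_val @ ridge_fit(X_tr, y_tr, lam) - y_val) ** 2)
        for lam in grid
    ])
    return float(grid[np.argmin(errors)]), errors

# Assumed toy setting: linear targets plus isotropic Gaussian noise.
rng = np.random.default_rng(0)
n, d, sigma = 40, 20, 1.0
w_true = rng.normal(size=d) / np.sqrt(d)
X_tr = rng.normal(size=(n, d))
y_tr = X_tr @ w_true + sigma * rng.normal(size=n)
X_val = rng.normal(size=(2000, d))
y_val = X_val @ w_true + sigma * rng.normal(size=2000)

grid = np.logspace(-3, 3, 25)
best_lam, errors = pick_lambda(X_tr, y_tr, X_val, y_val, grid)
print(f"selected regularization strength: {best_lam:.4g}")
```

A grid search needs a held-out sample and only returns values on the grid, which is one reason a dedicated numerical procedure for the optimal strength can be of interest.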

arXiv stat.ML TIER_1 (AF) · Mingyue Xu, Gal Vardi, Itay Safran · 2026-06-01 04:00

To Grok Grokking: Provable Grokking in Ridge Regression

arXiv:2601.19791v3. We study grokking, the onset of generalization long after overfitting, in a classical ridge regression setting. We prove end-to-end grokking results for learning over-parameterized linear regression models using gradient d…
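
To make the setting concrete, here is a hedged sketch of gradient descent with weight decay on an over-parameterized linear regression problem, logging train and test error at every step. It does not reproduce the paper's theorems or its exact regime: the data model, the initialization scale, the weight-decay value, and the step count are all assumptions chosen for illustration. In this toy setup the components of the weights that the training data cannot see shrink only through weight decay, so test error can keep improving after training error has flattened.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                       # fewer samples than parameters
lam, init_scale, steps = 1.0, 1.0, 5000   # assumed hyperparameters

w_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)
X_te = rng.normal(size=(1000, d))
y_te = X_te @ w_true + 0.1 * rng.normal(size=1000)

# Step size 1/L, where L is the largest curvature of the regularized objective
# 0.5 * ||Xw - y||^2 + 0.5 * lam * ||w||^2.
L = np.linalg.eigvalsh(X.T @ X).max() + lam
eta = 1.0 / L

w = init_scale * rng.normal(size=d)  # deliberately large random initialization
train_hist, test_hist = [], []
for _ in range(steps):
    train_hist.append(np.mean((X @ w - y) ** 2))
    test_hist.append(np.mean((X_te @ w - y_te) ** 2))
    grad = X.T @ (X @ w - y) + lam * w   # squared-loss gradient plus weight decay
    w -= eta * grad

# Gradient descent on this objective converges to the ridge solution.
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print("distance to ridge solution:", np.linalg.norm(w - w_ridge))
```

Plotting `train_hist` and `test_hist` on a logarithmic step axis is a simple way to compare when each curve levels off.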

arXiv stat.ML TIER_1 English(EN) · Petar Jolakoski · 2026-05-29 04:00

Ridge Regression from Poisson Resetting: A Renewal Perspective on Spectral Regularization

arXiv:2605.30059v1. We connect stochastic resetting from non-equilibrium statistical physics with ridge regularization in statistical learning. For linear gradient flow, resetting to the origin at rate $r$ produces stationary mean $(X^\top X+rI)^{-1}…

arXiv stat.ML TIER_1 English(EN) · Petar Jolakoski · 2026-05-28 15:10

Ridge Regression from Poisson Resetting: A Renewal Perspective on Spectral Regularization

We connect stochastic resetting from non-equilibrium statistical physics with ridge regularization in statistical learning. For linear gradient flow, resetting to the origin at rate $r$ produces stationary mean $(X^\top X+rI)^{-1}X^\top y$, exactly the ridge estimator with penalt…
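
The stated stationary mean can be checked with a short derivation. This is a sketch under two assumptions not spelled out in the excerpt: the flow is gradient flow on the squared loss $\tfrac12\|X\theta-y\|^2$, and in the stationary state the time $\tau$ since the last reset is exponentially distributed with rate $r$ (a standard property of Poisson resetting). The paper's own argument may be organized differently.

```latex
\begin{aligned}
\dot\theta(t) &= -X^\top\bigl(X\theta(t) - y\bigr), \quad \theta(0) = 0
  \;\Longrightarrow\;
  \theta(t) = \int_0^t e^{-X^\top X\, s}\,ds \; X^\top y, \\[4pt]
\mathbb{E}[\theta]
  &= \int_0^\infty r e^{-r\tau}\,\theta(\tau)\,d\tau
   = \int_0^\infty e^{-X^\top X\, s}\Bigl(\int_s^\infty r e^{-r\tau}\,d\tau\Bigr) ds \; X^\top y \\
  &= \int_0^\infty e^{-(X^\top X + rI)\,s}\,ds \; X^\top y
   = (X^\top X + rI)^{-1} X^\top y .
\end{aligned}
```

The second line swaps the order of integration; the final integral converges for any $r > 0$ because $X^\top X + rI$ is positive definite, so the reset rate plays the role of the ridge penalty.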
