PulseAugur
LIVE 05:55:12
ENTITY Norm-Separation Delay Law

Norm-Separation Delay Law

PulseAugur coverage of Norm-Separation Delay Law — every cluster mentioning Norm-Separation Delay Law across labs, papers, and developer communities, ranked by signal.

Total · 30d
1
1 over 90d
Releases · 30d
0
0 over 90d
Papers · 30d
1
1 over 90d
TIER MIX · 90D
RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_14472 ·

    Convergence Rate Analysis of the AdamW-Style Shampoo: Unifying One-sided and Two-Sided Preconditioning

    A new theory, the Norm-Separation Delay Law, explains the phenomenon of grokking, where models generalize long after memorizing training data. Researchers demonstrated that grokking is a representational phase transitio…