PulseAugur
Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent

New Framework Decodes Deep Learning Phenomena: Grokking and Double Descent

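Researchers have developed a new framework for analyzing and explaining complex learning dynamics in deep neural networks, focusing on phenomena such as grokking and double descent. The framework decomposes learning into two competing processes: representation learning inside the encoder and readout calibration at the final classifier. Applying this decomposition, the study offers a finer-grained view of generalization, separates genuine improvements from spurious ones, and provides diagnostic tools for interpretability research.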
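
The split between representation and readout can be illustrated with a linear probe. The sketch below is not the paper's procedure; it is a minimal illustration under stated assumptions: the encoder's features are frozen, a fresh linear readout is refit on them by ridge regression, and the refit readout's test accuracy is compared with that of the model's own readout. The helper names (`fit_probe`, `decompose`), the synthetic Gaussian-cluster features, and the all-zero "stale" readout are hypothetical choices made only for this example.

```python
import numpy as np

def accuracy(features, weights, labels):
    """Fraction of samples whose argmax logit matches the label."""
    return float(np.mean(np.argmax(features @ weights, axis=1) == labels))

def fit_probe(features, labels, num_classes, ridge=1e-3):
    """Refit a linear readout on frozen features (ridge regression to one-hot targets)."""
    targets = np.eye(num_classes)[labels]
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)

def decompose(train_feats, train_labels, test_feats, test_labels, readout, num_classes):
    """Compare the model's own readout with a freshly fitted probe on the same features."""
    probe = fit_probe(train_feats, train_labels, num_classes)
    model_acc = accuracy(test_feats, readout, test_labels)
    probe_acc = accuracy(test_feats, probe, test_labels)
    return {
        "model_acc": model_acc,               # what standard monitoring reports
        "probe_acc": probe_acc,               # proxy for representation quality
        "readout_gap": probe_acc - model_acc, # accuracy lost to a poor readout
    }

# Synthetic stand-in for encoder outputs (assumption): three Gaussian clusters.
rng = np.random.default_rng(0)
num_classes, dim, n = 3, 8, 600
centers = 3.0 * rng.normal(size=(num_classes, dim))
labels = rng.integers(0, num_classes, size=n)
feats = centers[labels] + rng.normal(size=(n, dim))
feats = np.hstack([feats, np.ones((n, 1))])  # constant column acts as a bias term

train_feats, test_feats = feats[:400], feats[400:]
train_labels, test_labels = labels[:400], labels[400:]

# Hypothetical uncalibrated readout: all-zero weights, so every input maps to class 0.
stale_readout = np.zeros((dim + 1, num_classes))

result = decompose(train_feats, train_labels, test_feats, test_labels,
                   stale_readout, num_classes)
print(result)
```

In this toy setting a large `readout_gap` means the features already separate the classes and only the readout lags behind. A small gap together with a low `probe_acc` would point to the representation itself.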

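Impact: Provides a unified framework for understanding and diagnosing complex learning behavior in neural networks, which supports interpretability research.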

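Ranking rationale: This cluster contains one academic paper that presents a new theoretical framework for understanding machine learning phenomena.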


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.AI TIER_1 English(EN) · Chi-Ning Chou, Oscar Uzdelewicz, Neng-Chun Chiu, Yao-Yuan Yang, SueYeon Chung

    Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent

    arXiv:2605.27078v1 (announce type: cross). Abstract: Training loss and accuracy are the standard signals used to monitor generalization during deep neural network training. Two well-documented phenomena complicate this picture: in grokking, train loss falls rapidly while test perfor…

  2. arXiv cs.AI TIER_1 English(EN) · SueYeon Chung

    Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent

    Training loss and accuracy are the standard signals used to monitor generalization during deep neural network training. Two well-documented phenomena complicate this picture: in grokking, train loss falls rapidly while test performance improves abruptly only after a long delay; i…