PulseAugur

New theory explains dropout universality in neural networks

Researchers have developed a mean-field theory of dropout in neural networks that treats dropout as a perturbation of critical signal propagation. The theory identifies distinct universality classes for smooth and ReLU-like activation functions, each with its own critical exponents and scaling behavior. It also suggests dropout scheduling strategies that can reduce test loss and improve accuracy without increasing computational cost; the predictions were tested on MLPs and Vision Transformers.
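To make the "perturbation of critical signal propagation" idea concrete, here is a minimal sketch of the textbook mean-field correlation map for a ReLU network with dropout. It is not taken from the paper: it assumes zero bias variance, inverted dropout with independent masks for the two inputs, and a weight variance of 2 × keep probability so that the activation variance stays fixed. Under those assumptions the correlation between two inputs evolves layer to layer as c' = keep_prob × f(c), so perfect alignment (c = 1) stops being a fixed point as soon as keep_prob < 1. The function names are illustrative, and the paper's own parameterization may differ.

```python
import math


def relu_corr_map(c: float) -> float:
    """Mean-field correlation map for ReLU at critical init, no dropout.

    Closed form of the normalized ReLU (arc-cosine) kernel; maps the
    correlation of two inputs at one layer to the next layer.
    """
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return (math.sqrt(1.0 - c * c) + c * (math.pi - math.acos(c))) / math.pi


def dropout_corr_map(c: float, keep_prob: float) -> float:
    """Correlation map with dropout (ASSUMPTION: independent masks per input,
    inverted dropout, zero bias, weight variance 2 * keep_prob)."""
    return keep_prob * relu_corr_map(c)


def fixed_point(keep_prob: float, c0: float = 0.5,
                tol: float = 1e-12, max_iter: int = 100_000) -> float:
    """Iterate the map until successive correlations differ by less than tol."""
    c = c0
    for _ in range(max_iter):
        c_next = dropout_corr_map(c, keep_prob)
        if abs(c_next - c) < tol:
            return c_next
        c = c_next
    return c


if __name__ == "__main__":
    for p in (1.0, 0.95, 0.9, 0.8):
        print(f"keep_prob={p:.2f}  fixed-point correlation={fixed_point(p):.6f}")
```

In this toy setting the fixed point sits at 1 without dropout and moves below 1 as the keep probability drops, which is the qualitative effect the summary describes as a shifted perfect-alignment fixed point.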

Impact: Provides a theoretical framework for optimizing dropout scheduling, potentially improving model performance and efficiency.

Ranking rationale: The cluster contains an academic paper presenting a new theoretical framework for understanding a machine learning technique.

Read on arXiv stat.ML →

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

  1. arXiv stat.ML TIER_1 English(EN) · Lucas Fernandez Sarmiento

    Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

    arXiv:2605.21648v1 Announce Type: cross Abstract: We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at c…

  2. arXiv stat.ML TIER_1 English(EN) · Lucas Fernandez Sarmiento

    Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

    We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at critical initialization. We derive critical and cro…