PulseAugur

New theory explains dropout universality in neural networks

Researchers have developed a mean-field theory of dropout in neural networks that treats dropout as a perturbation of critical signal propagation at the edge of chaos. The theory establishes distinct universality classes for smooth and ReLU-like activation functions, each with its own critical exponents and scaling behavior. It also suggests optimal dropout scheduling strategies that can reduce test loss and improve accuracy without increasing computational cost, with predictions tested on MLPs and Vision Transformers.

IMPACT Provides a theoretical framework to optimize dropout scheduling, potentially improving model performance and efficiency.

RANK_REASON The cluster contains an academic paper detailing a new theoretical framework for understanding a machine learning technique.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv stat.ML TIER_1 English(EN) · Lucas Fernandez Sarmiento

    Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

    arXiv:2605.21648v1 (cross-list). Abstract: We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at c…

  2. arXiv stat.ML TIER_1 English(EN) · Lucas Fernandez Sarmiento

    Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

    We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at critical initialization. We derive critical and cro…
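
The abstract's statement that dropout shifts the perfect-alignment fixed point can be illustrated with a minimal sketch of the standard mean-field correlation map for a wide ReLU network. This is not taken from the paper: the setup (zero bias variance, weight variance tuned to preserve pre-activation variance, inverted dropout with independent masks for the two inputs) and the names `relu_correlation_map`, `fixed_point`, and `keep_prob` are assumptions made here for illustration.

```python
import math


def relu_correlation_map(c, keep_prob=1.0):
    """Map the correlation c between two inputs through one ReLU layer.

    Illustrative assumptions (not from the paper): infinite width, zero
    bias variance, weight variance 2 * keep_prob so that the pre-activation
    variance is preserved, and inverted dropout with independent masks
    for the two inputs.
    """
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    # Normalized ReLU (arc-cosine) kernel; equals 1 at c = 1.
    kernel = (math.sqrt(1.0 - c * c) + c * (math.pi - math.acos(c))) / math.pi
    # Independent masks rescale the cross-correlation by keep_prob.
    return keep_prob * kernel


def fixed_point(keep_prob, c0=0.5, iters=10000):
    """Iterate the map from c0 to approximate its stable fixed point."""
    c = c0
    for _ in range(iters):
        c = relu_correlation_map(c, keep_prob)
    return c


if __name__ == "__main__":
    for keep_prob in (1.0, 0.9, 0.8):
        print(keep_prob, fixed_point(keep_prob))
```

Under these assumptions, perfectly aligned inputs (c = 1) map to c = `keep_prob`, so with any dropout the fixed point sits strictly below 1 and the map contracts toward it at a geometric rate, which is one way to read the "finite depth scale" in the abstract. Without dropout (`keep_prob = 1.0`) the iteration approaches c = 1 only slowly.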