PulseAugur

StableGrad stabilizes deep neural network training without batch normalization

Researchers have introduced StableGrad, an optimizer-level mechanism that controls the scale of activations and gradients in deep neural networks. It aims to prevent training instability without batch normalization, which can be problematic in applications such as Physics-Informed Neural Networks (PINNs). StableGrad corrects weight-gradient imbalances after backpropagation but before the optimizer update, so the network's forward pass and the accuracy of the physical residual are left untouched. In evaluations on deep PINNs and on standard architectures such as ResNet and EfficientNet, StableGrad improved accuracy and stabilized optimization even with batch normalization removed.
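To make the placement of such a step concrete, here is a minimal sketch of a gradient adjustment that runs between backpropagation and the optimizer update. It is not the rule from the StableGrad paper, which this summary does not spell out. The function names (`rescale_gradients`, `sgd_step`), the `target_ratio` parameter, and the layer-wise norm-ratio rule are all assumptions chosen for illustration. The only point it demonstrates is that the weights are not modified before the update, so the forward pass is unchanged.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def rescale_gradients(weights, grads, target_ratio=1e-2, eps=1e-12):
    """Hypothetical per-layer rule (an assumption, not the paper's method):
    scale each layer's gradient so that ||grad|| / ||weight|| == target_ratio.
    Only gradients are changed; weights are read but never written."""
    rescaled = []
    for w, g in zip(weights, grads):
        w_norm, g_norm = l2_norm(w), l2_norm(g)
        if w_norm < eps or g_norm < eps:
            # Leave degenerate layers alone rather than divide by ~0.
            rescaled.append(list(g))
            continue
        scale = target_ratio * w_norm / g_norm
        rescaled.append([scale * v for v in g])
    return rescaled


def sgd_step(weights, grads, lr=0.1):
    """Plain SGD update, standing in for any optimizer."""
    return [[w_i - lr * g_i for w_i, g_i in zip(w, g)]
            for w, g in zip(weights, grads)]


# Two toy "layers": one with an exploding gradient, one with a vanishing one.
weights = [[3.0, 4.0], [0.6, 0.8]]
grads = [[300.0, 400.0], [3e-6, 4e-6]]

# Order of operations: backward pass (grads given) -> rescale -> optimizer update.
balanced = rescale_gradients(weights, grads)
new_weights = sgd_step(weights, balanced)
```

After rescaling, both toy layers have the same gradient-to-weight norm ratio, so a single learning rate moves them by comparable relative amounts. A real implementation would apply an analogous step to the framework's gradient tensors right before calling the optimizer.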

Impact: Offers a new technique for stabilizing deep neural network training, particularly useful for physics-informed models where standard normalization methods are unsuitable.

Ranking rationale: The cluster contains a new academic paper detailing a novel method for training neural networks.


AI-generated summary · Google Gemini · from 1 source.


Sources [1]

  1. arXiv cs.AI TIER_1 English (EN) · Enrique S. Quintana-Ortí

    StableGrad: Backward Scale Control without Batch Normalization

    Training very deep neural networks requires controlling the propagation of magnitudes across depth. Without such control, activations and gradients may vanish, explode, or enter unstable regimes that make optimization fail. Modern architectures often mitigate this problem through…