PulseAugur
research · [2 sources]

New theory explains the stable training of orthogonal neural networks

Researchers have derived explicit layer-wise recursion relations for the tensors appearing in the finite-width expansion of network statistics under orthogonal initialization. The result gives a theoretical explanation for the training stability of finite-width nonlinear networks initialized with orthogonal weights. Numerical solutions of the recursions and analytical expansions agreed well with Monte-Carlo estimates, validating the findings experimentally.
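As a concrete illustration, here is a minimal sketch (not the authors' code): it Monte-Carlo-estimates the layer-wise pre-activation second moment of a finite-width tanh network with orthogonal weights and compares it to the standard infinite-width recursion K_{l+1} = sigma_w^2 E_{z~N(0,K_l)}[tanh(z)^2]. The width, depth, gain sigma_w, and tanh nonlinearity are all assumptions made for the example, not values from the paper.

```python
# Minimal sketch (assumed setup, not the paper's code): compare finite-width
# Monte-Carlo statistics of an orthogonally initialized tanh network against
# the infinite-width recursion K_{l+1} = sigma_w^2 * E[tanh(z)^2], z~N(0,K_l).
import numpy as np

rng = np.random.default_rng(0)
width, depth, sigma_w, n_runs = 128, 10, 1.5, 500  # assumed values

def haar_orthogonal(n, rng):
    """Sample a Haar-uniform orthogonal matrix via QR of a Gaussian."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # sign fix makes the law Haar-uniform

def infinite_width_K(depth, sigma_w, k0, rng, n_gauss=200_000):
    """Iterate the infinite-width recursion, estimating the Gaussian
    expectation E[tanh(sqrt(K) z)^2], z ~ N(0, 1), by sampling."""
    z = rng.standard_normal(n_gauss)
    ks, k = [k0], k0
    for _ in range(depth):
        k = sigma_w**2 * np.mean(np.tanh(np.sqrt(k) * z) ** 2)
        ks.append(k)
    return np.array(ks)

# Monte-Carlo over finite-width network realizations.
mc = np.zeros(depth + 1)
for _ in range(n_runs):
    h = rng.standard_normal(width)
    h *= np.sqrt(width) / np.linalg.norm(h)  # layer-0 pre-activation, K_0 = 1
    mc[0] += np.mean(h**2)
    for l in range(depth):
        h = sigma_w * haar_orthogonal(width, rng) @ np.tanh(h)
        mc[l + 1] += np.mean(h**2)
mc /= n_runs

print("layer   finite-width MC   infinite-width K")
for l, (a, b) in enumerate(zip(mc, infinite_width_K(depth, sigma_w, 1.0, rng))):
    print(f"{l:5d}   {a:15.4f}   {b:16.4f}")
```

At settings like these the finite-width column should track the infinite-width one closely, with small deviations of the kind the paper's finite-width expansion quantifies.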

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Provides a theoretical explanation for the improved training performance of orthogonally initialized networks, potentially guiding future model architectures.

RANK_REASON This is a research paper published on arXiv detailing theoretical and experimental findings on neural network initialization.

Read on arXiv cs.LG →

COVERAGE [2]

  1. arXiv cs.LG TIER_1 · Max Guillen, Jan E. Gerken

    Criticality and Saturation in Orthogonal Neural Networks

    arXiv:2605.06563v1 · Abstract: It has been known for a long time that initializing weight matrices to be orthogonal instead of having i.i.d. Gaussian components can improve training performance. This phenomenon can be analyzed using finite-width corrections, wher…

  2. arXiv cs.LG TIER_1 · Jan E. Gerken

    Criticality and Saturation in Orthogonal Neural Networks

    It has been known for a long time that initializing weight matrices to be orthogonal instead of having i.i.d. Gaussian components can improve training performance. This phenomenon can be analyzed using finite-width corrections, where the infinite-width statistics are supplemented…
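The abstracts above contrast orthogonal with i.i.d. Gaussian initialization. As a toy illustration of why the orthogonal case is better behaved, the sketch below pushes a unit vector through a deep stack of random linear layers; this is a linear caricature under assumed sizes, not the paper's nonlinear finite-width analysis. Orthogonal products preserve the norm exactly, while variance-1/n Gaussian products preserve it only in expectation, with fluctuations that grow with depth.

```python
# Toy comparison (assumed sizes, linear layers only): norm propagation
# through deep stacks of orthogonal vs. i.i.d. Gaussian random matrices.
import numpy as np

rng = np.random.default_rng(1)
width, depth, trials = 64, 50, 200  # assumed toy values

def haar_orthogonal(n, rng):
    """Haar-uniform orthogonal matrix via sign-corrected QR."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def gaussian(n, rng):
    """i.i.d. Gaussian entries with variance 1/n, so E||Wx||^2 = ||x||^2."""
    return rng.standard_normal((n, n)) / np.sqrt(n)

def output_norm(sample_w, rng):
    """Push a unit vector through `depth` random linear layers."""
    x = rng.standard_normal(width)
    x /= np.linalg.norm(x)
    for _ in range(depth):
        x = sample_w(width, rng) @ x
    return np.linalg.norm(x)

orth = [output_norm(haar_orthogonal, rng) for _ in range(trials)]
gaus = [output_norm(gaussian, rng) for _ in range(trials)]
print(f"orthogonal: mean {np.mean(orth):.3f}, std {np.std(orth):.1e}")
print(f"gaussian  : mean {np.mean(gaus):.3f}, std {np.std(gaus):.1e}")
# Orthogonal stacks return norm 1.000 up to round-off; Gaussian stacks
# preserve the norm only on average, with depth-growing fluctuations.
```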