PulseAugur
research · [2 sources]

New theory explains the stable training of orthogonal neural networks

Researchers have derived explicit layer-wise recursion relations for the tensors appearing in the finite-width expansion of network statistics under orthogonal initialization. The result gives a theoretical explanation for the training stability of finite-width nonlinear networks initialized with orthogonal weights. Numerical solutions of the recursions and analytical expansions agreed well with Monte-Carlo estimates, validating the findings experimentally.
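As a concrete illustration, here is a minimal sketch (not the authors' code): it Monte-Carlo-estimates the layer-wise pre-activation second moment of a finite-width tanh network with orthogonal weights and compares it to the standard infinite-width recursion K_{l+1} = sigma_w^2 E_{z~N(0,K_l)}[tanh(z)^2]. The width, depth, gain sigma_w, and tanh nonlinearity are all assumptions made for the example, not values from the paper.

```python
# Minimal sketch (assumed setup, not the paper's code): compare finite-width
# Monte-Carlo statistics of an orthogonally initialized tanh network against
# the infinite-width recursion K_{l+1} = sigma_w^2 * E[tanh(z)^2], z~N(0,K_l).
import numpy as np

rng = np.random.default_rng(0)
width, depth, sigma_w, n_runs = 128, 10, 1.5, 500  # assumed values

def haar_orthogonal(n, rng):
    """Sample a Haar-uniform orthogonal matrix via QR of a Gaussian."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # sign fix makes the law Haar-uniform

def infinite_width_K(depth, sigma_w, k0, rng, n_gauss=200_000):
    """Iterate the infinite-width recursion, estimating the Gaussian
    expectation E[tanh(sqrt(K) z)^2], z ~ N(0, 1), by sampling."""
    z = rng.standard_normal(n_gauss)
    ks, k = [k0], k0
    for _ in range(depth):
        k = sigma_w**2 * np.mean(np.tanh(np.sqrt(k) * z) ** 2)
        ks.append(k)
    return np.array(ks)

# Monte-Carlo over finite-width network realizations.
mc = np.zeros(depth + 1)
for _ in range(n_runs):
    h = rng.standard_normal(width)
    h *= np.sqrt(width) / np.linalg.norm(h)  # layer-0 pre-activation, K_0 = 1
    mc[0] += np.mean(h**2)
    for l in range(depth):
        h = sigma_w * haar_orthogonal(width, rng) @ np.tanh(h)
        mc[l + 1] += np.mean(h**2)
mc /= n_runs

print("layer   finite-width MC   infinite-width K")
for l, (a, b) in enumerate(zip(mc, infinite_width_K(depth, sigma_w, 1.0, rng))):
    print(f"{l:5d}   {a:15.4f}   {b:16.4f}")
```

At settings like these the finite-width column should track the infinite-width one closely, with small deviations of the kind the paper's finite-width expansion quantifies.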

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Provides a theoretical explanation for the improved training performance of orthogonally initialized networks, potentially guiding future model architectures.

RANK_REASON This is a research paper published on arXiv detailing theoretical and experimental findings on neural network initialization.

Read on arXiv cs.LG →

COVERAGE [2]

  1. arXiv cs.LG TIER_1 · Max Guillen, Jan E. Gerken

    Criticality and Saturation in Orthogonal Neural Networks

    arXiv:2605.06563v1 · Abstract: It has been known for a long time that initializing weight matrices to be orthogonal instead of having i.i.d. Gaussian components can improve training performance. This phenomenon can be analyzed using finite-width corrections, wher…

  2. arXiv cs.LG TIER_1 · Jan E. Gerken

    Criticality and Saturation in Orthogonal Neural Networks

    It has been known for a long time that initializing weight matrices to be orthogonal instead of having i.i.d. Gaussian components can improve training performance. This phenomenon can be analyzed using finite-width corrections, where the infinite-width statistics are supplemented…
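The abstracts above contrast orthogonal with i.i.d. Gaussian initialization. As a toy illustration of why the orthogonal case is better behaved, the sketch below pushes a unit vector through a deep stack of random linear layers; this is a linear caricature under assumed sizes, not the paper's nonlinear finite-width analysis. Orthogonal products preserve the norm exactly, while variance-1/n Gaussian products preserve it only in expectation, with fluctuations that grow with depth.

```python
# Toy comparison (assumed sizes, linear layers only): norm propagation
# through deep stacks of orthogonal vs. i.i.d. Gaussian random matrices.
import numpy as np

rng = np.random.default_rng(1)
width, depth, trials = 64, 50, 200  # assumed toy values

def haar_orthogonal(n, rng):
    """Haar-uniform orthogonal matrix via sign-corrected QR."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def gaussian(n, rng):
    """i.i.d. Gaussian entries with variance 1/n, so E||Wx||^2 = ||x||^2."""
    return rng.standard_normal((n, n)) / np.sqrt(n)

def output_norm(sample_w, rng):
    """Push a unit vector through `depth` random linear layers."""
    x = rng.standard_normal(width)
    x /= np.linalg.norm(x)
    for _ in range(depth):
        x = sample_w(width, rng) @ x
    return np.linalg.norm(x)

orth = [output_norm(haar_orthogonal, rng) for _ in range(trials)]
gaus = [output_norm(gaussian, rng) for _ in range(trials)]
print(f"orthogonal: mean {np.mean(orth):.3f}, std {np.std(orth):.1e}")
print(f"gaussian  : mean {np.mean(gaus):.3f}, std {np.std(gaus):.1e}")
# Orthogonal stacks return norm 1.000 up to round-off; Gaussian stacks
# preserve the norm only on average, with depth-growing fluctuations.
```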