IGLU: The Integrated Gaussian Linear Unit Activation Function
Researchers have introduced IGLU, a parametric activation function for deep neural networks designed to improve gradient flow and optimization stability. IGLU is derived as a mixture of GELU gates under a half-normal distribution, and a single parameter interpolates continuously between identity-like and ReLU-like behavior. Its heavy-tailed Cauchy gate ensures non-zero gradients for all finite inputs, which makes training more robust to vanishing gradients. An efficient approximation, IGLU-Approx, uses only ReLU operations, reducing computational cost while maintaining competitive performance across vision and language datasets.
Impact: Introduces a new activation function that may improve training stability and performance in deep learning models.
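
The summary above does not give the formula, so the following is only a minimal sketch under a stated assumption: that the "Cauchy gate" is the cumulative distribution function of a zero-centred Cauchy distribution with scale `sigma`, and that IGLU multiplies the input by this gate in the same way GELU multiplies it by a Gaussian CDF. The names `cauchy_gate`, `iglu`, and `sigma` are illustrative and are not taken from the source. IGLU-Approx is not sketched because its construction is not described here.

```python
import math


def cauchy_gate(x: float, sigma: float = 1.0) -> float:
    """Assumed gate: CDF of a zero-centred Cauchy distribution with scale sigma.

    The value lies strictly between 0 and 1 for every finite x, and the
    tails decay polynomially rather than exponentially (heavy tails).
    """
    return 0.5 + math.atan(x / sigma) / math.pi


def iglu(x: float, sigma: float = 1.0) -> float:
    """Assumed IGLU form: the input scaled by the Cauchy gate."""
    return x * cauchy_gate(x, sigma)


def relu(x: float) -> float:
    """Standard ReLU, used only for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    inputs = (-2.0, -0.5, 0.5, 2.0)
    # Small sigma: the gate approaches a step function (ReLU-like).
    # Large sigma: the gate approaches the constant 1/2 (linear in x).
    for sigma in (1e-3, 1.0, 1e3):
        values = [round(iglu(x, sigma), 4) for x in inputs]
        print(f"sigma={sigma:g}: {values}")
    print("relu:", [relu(x) for x in inputs])
```

Under this assumed form, a very small `sigma` makes the output track ReLU closely, while a very large `sigma` makes the gate nearly constant so the function becomes close to a scaled identity, which matches the interpolation the summary describes.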