Brief · PulseAugur

TOOL · arXiv cs.LG

IGLU: The Integrated Gaussian Linear Unit Activation Function

Researchers have introduced IGLU (Integrated Gaussian Linear Unit), a parametric activation function for deep neural networks designed to improve gradient flow and optimization stability. Derived as a mixture of GELU gates under a half-normal distribution, IGLU interpolates continuously between identity-like and ReLU-like behavior through a single parameter. Its heavy-tailed Cauchy gate keeps gradients non-zero for all finite inputs, which makes it more robust to vanishing gradients. An efficient approximation, IGLU-Approx, uses only ReLU operations, reducing computational cost while remaining competitive on vision and language datasets.
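To make the "Cauchy gate" idea concrete, here is a minimal sketch. The brief does not give IGLU's formula, so this is an assumption, not the authors' implementation: it takes the gate to be the Cauchy CDF with a scale parameter and multiplies it by the input, in the same way GELU multiplies its input by a Gaussian CDF. The names `cauchy_gate`, `iglu`, `iglu_grad`, and `sigma` are hypothetical, chosen for illustration only.

```python
import math


def cauchy_gate(x: float, sigma: float = 1.0) -> float:
    """Cauchy CDF with scale sigma (assumed gate form); values lie in (0, 1)."""
    return 0.5 + math.atan(x / sigma) / math.pi


def iglu(x: float, sigma: float = 1.0) -> float:
    """Assumed IGLU-style activation: input times a Cauchy-CDF gate."""
    return x * cauchy_gate(x, sigma)


def iglu_grad(x: float, sigma: float = 1.0) -> float:
    """Derivative of iglu via the product rule: gate + x * Cauchy density."""
    density = 1.0 / (math.pi * sigma * (1.0 + (x / sigma) ** 2))
    return cauchy_gate(x, sigma) + x * density


if __name__ == "__main__":
    # Small sigma sharpens the gate toward a step (ReLU-like);
    # large sigma flattens it toward 1/2 (a scaled identity).
    for sigma in (1e-3, 1.0, 1e3):
        print(sigma, [round(iglu(x, sigma), 4) for x in (-2.0, 0.0, 2.0)])
    # Under this form the gradient stays positive even far into the negative tail.
    print([iglu_grad(x) for x in (-1.0, -10.0, -50.0)])
```

Under this assumed form, the single parameter `sigma` controls the interpolation described above, and the gate's polynomially decaying tail is what keeps the gradient from reaching zero for finite negative inputs.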

IMPACT Introduces a new activation function that may improve training stability and performance in deep learning models.

CIFAR-10
CIFAR-100
ReLU
GELU
WikiText-103
GPT-2 Small
ViT-Tiny
ResNet-20
Mingi Kang