Brief · PulseAugur

RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

Beyond $\ell_2$-norm and $\ell_\infty$-norm: A Curvature-Inspired $\ell_p$-Norm Scheme for Deep Neural Networks

Researchers have introduced an optimization scheme for deep neural networks that uses a dynamic $\ell_p$-norm rather than a fixed $\ell_2$ or $\ell_\infty$ norm. The resulting methods, LPSGD and LPSGDM, aim to improve convergence and generalization by adapting the norm parameter $p$ during training. Training begins with a large $p$ to manage high-curvature directions, then gradually decreases $p$ toward 2 for more stable updates. The authors report a theoretical $O(T^{-1/2})$ convergence rate for non-convex problems.
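To make the idea of a decaying $p$ concrete, here is a minimal NumPy sketch. It is not the authors' LPSGD: the brief gives neither the update rule nor the schedule, so the code uses the textbook steepest-descent direction under an $\ell_p$ norm (with dual exponent $q = p/(p-1)$) and a linear decay of $p$. The names `p_schedule`, `lp_steepest_direction`, and `lp_sgd_step`, and the defaults `p_start=8.0` and `p_end=2.0`, are all hypothetical.

```python
import numpy as np


def p_schedule(step, total_steps, p_start=8.0, p_end=2.0):
    """Hypothetical schedule: linearly decay p from p_start to p_end."""
    frac = min(max(step / max(total_steps - 1, 1), 0.0), 1.0)
    return p_start + frac * (p_end - p_start)


def lp_steepest_direction(grad, p, eps=1e-12):
    """Unit-l_p-norm direction that maximizes the inner product with grad.

    With q = p / (p - 1), the direction is
        sign(g) * |g|^(q-1) / ||g||_q^(q-1).
    p = 2 gives the normalized gradient; p -> infinity approaches sign(g).
    Requires p > 1.
    """
    q = p / (p - 1.0)
    abs_g = np.abs(grad)
    norm_q = np.sum(abs_g ** q) ** (1.0 / q)
    # eps only guards against division by zero for an all-zero gradient.
    return np.sign(grad) * abs_g ** (q - 1.0) / (norm_q ** (q - 1.0) + eps)


def lp_sgd_step(w, grad, lr, p):
    """One illustrative update: move against the l_p steepest direction."""
    return w - lr * lp_steepest_direction(grad, p)
```

In this sketch a large $p$ pushes every coordinate toward a similar step magnitude, which resembles sign-based updates, while $p = 2$ recovers a normalized gradient step. Whether the paper normalizes the step in this way is an assumption.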

IMPACT Introduces a novel optimization technique that could improve training efficiency and generalization for deep learning models.

CIFAR-10
SGD
CIFAR-100
ResNet-50
ResNet-18
ImageNet-1K
Deep Neural Networks
VGG-11
SGD with momentum
LPSGDM
LPSGD