Beyond $\ell_2$-norm and $\ell_\infty$-norm: A Curvature-Inspired $\ell_p$-Norm Scheme for Deep Neural Networks
Researchers have proposed an optimization scheme for deep neural networks that uses a dynamic $\ell_p$-norm rather than a fixed $\ell_2$ or $\ell_\infty$ norm. The resulting optimizers, LPSGD and LPSGDM, aim to improve convergence and generalization by adapting the norm parameter $p$ during training. Training starts with a large $p$ to handle high-curvature directions, then gradually decreases $p$ toward 2 for more stable updates. The authors report a theoretical $O(T^{-1/2})$ convergence rate for non-convex problems.
AI IMPACT: Introduces an optimization technique that could improve training efficiency and generalization for deep learning models.
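To make the idea in the summary concrete (a large $p$ early in training, decreasing toward 2), here is a minimal sketch in plain Python. It is not the paper's LPSGD or LPSGDM algorithm: the update is the textbook steepest-descent direction under an $\ell_p$-norm constraint, and the linear schedule, the step-size rule, the toy quadratic, and every name (`lp_steepest_direction`, `p_schedule`, `run_demo`) are assumptions made for illustration only.

```python
import math


def lp_steepest_direction(grad, p):
    """Return the direction d with ||d||_p = 1 that maximizes <grad, d>.

    Standard dual-norm result, with q = p / (p - 1):
        d_i = sign(g_i) * (|g_i| / ||g||_q) ** (q - 1)
    p = 2 gives the normalized gradient; p = inf gives sign(grad).
    """
    if p < 2:
        raise ValueError("this sketch assumes p >= 2")
    max_abs = max(abs(g) for g in grad)
    if max_abs == 0.0:
        return [0.0] * len(grad)
    if math.isinf(p):
        return [0.0 if g == 0 else math.copysign(1.0, g) for g in grad]
    q = p / (p - 1.0)  # Hoelder conjugate exponent of p
    # Rescale by the largest entry first so the powers stay well conditioned.
    scaled = [abs(g) / max_abs for g in grad]
    norm_q = sum(s ** q for s in scaled) ** (1.0 / q)
    return [
        math.copysign((s / norm_q) ** (q - 1.0), g)
        for s, g in zip(scaled, grad)
    ]


def p_schedule(step, total_steps, p_start=16.0, p_end=2.0):
    """Hypothetical linear decay of p from p_start down to p_end."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return p_start + frac * (p_end - p_start)


def run_demo(total_steps=200, lr0=0.1):
    """Minimize an ill-conditioned toy quadratic f(x) = 0.5 * sum(a_i * x_i^2)."""
    curvature = [100.0, 1.0]  # one high-curvature and one flat direction
    x = [1.0, 1.0]

    def loss(point):
        return 0.5 * sum(a * v * v for a, v in zip(curvature, point))

    initial = loss(x)
    for step in range(total_steps):
        grad = [a * v for a, v in zip(curvature, x)]
        p = p_schedule(step, total_steps)
        direction = lp_steepest_direction(grad, p)
        lr = lr0 / math.sqrt(step + 1)  # assumed decaying step size
        x = [v - lr * d for v, d in zip(x, direction)]
    return initial, loss(x)


if __name__ == "__main__":
    start, end = run_demo()
    print(f"initial loss: {start:.4f}, final loss: {end:.6f}")
```

With a large $p$ the direction is close to `sign(grad)`, so the high-curvature coordinate cannot dominate the step; as $p$ approaches 2 the update becomes the normalized gradient. A momentum variant would apply the same direction map to a running average of gradients, which is again an assumption rather than something the summary specifies.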