A Fully First-Order Layer for Differentiable Optimization
Researchers are exploring ways to optimize neural networks without the most expensive parts of traditional gradient-based pipelines. One paper introduces a fully first-order layer for differentiable optimization: by reformulating the problem as a bilevel optimization task, it avoids computationally intensive Hessian calculations. Another study proposes a gradient-free method for infinite-dimensional optimization in Hilbert spaces that relies on directional derivatives and automatic differentiation, and it has shown promise for solving differential equations with physics-informed neural networks. A practical demonstration on the MNIST dataset used a derivative-free optimization method for image classification in a high-dimensional parameter space, reaching competitive accuracy and outperforming a baseline Adam optimizer.
AI IMPACT: These first-order and gradient-free techniques could offer new avenues for training complex models, potentially reducing computational costs and enabling optimization in scenarios where gradients are difficult to compute.
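To make the idea of optimizing with directional derivatives instead of full gradients concrete, here is a minimal sketch in plain Python. It is not the method of any paper summarized above: it is a generic random-direction descent on a toy quadratic, and it estimates each directional derivative with a central finite difference where a real system might use forward-mode automatic differentiation. The names `directional_derivative`, `random_direction_descent`, and `quadratic` are hypothetical, chosen only for illustration.

```python
import random


def directional_derivative(f, x, d, eps=1e-5):
    """Central finite-difference estimate of the slope of f at x along d.

    Assumption: a finite difference stands in for forward-mode autodiff here.
    """
    x_plus = [xi + eps * di for xi, di in zip(x, d)]
    x_minus = [xi - eps * di for xi, di in zip(x, d)]
    return (f(x_plus) - f(x_minus)) / (2.0 * eps)


def random_direction_descent(f, x0, steps=2000, lr=0.1, seed=0):
    """Minimize f using only scalar slopes along random unit directions.

    The full gradient vector is never formed; each step needs two
    evaluations of f.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        # Sample a random direction and normalize it to unit length.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = sum(di * di for di in d) ** 0.5
        d = [di / norm for di in d]
        # Step against the direction, scaled by the estimated slope.
        slope = directional_derivative(f, x, d)
        x = [xi - lr * slope * di for xi, di in zip(x, d)]
    return x


def quadratic(x):
    """Toy objective (illustrative only) with its minimum at x = (1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    x_final = random_direction_descent(quadratic, [0.0] * 5)
    print("final objective:", quadratic(x_final))
```

Each update costs two function evaluations regardless of dimension, which is the appeal when full gradients are expensive or unavailable. The trade-off is that progress per step shrinks as the number of parameters grows, because a random direction captures only a fraction of the descent direction.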