CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
Researchers have introduced CAGE (Curvature-Aware Gradient Estimation), a quantization-aware training (QAT) method that aims to close the accuracy gap between quantized and natively trained models. CAGE augments the straight-through estimator (STE) with a curvature-aware correction term, derived from a multi-objective view of optimization that balances loss minimization against quantization constraints. It halves the accuracy loss from compression in fine-tuning scenarios and achieves 3-bit quantization accuracy comparable to prior 4-bit methods when applied to Llama models.
AI IMPACT: This QAT method could enable more efficient deployment of large AI models on resource-constrained hardware by reducing model size with minimal accuracy loss.
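For readers unfamiliar with the straight-through estimator that CAGE builds on, the following is a minimal sketch of the plain STE baseline only. It does not implement CAGE: the summary above does not give the form of the curvature-aware correction term, so it is left out. The scalar linear model, the uniform grid with spacing `step`, the learning rate, and the function names `quantize`, `loss_grad`, and `ste_train` are all illustrative assumptions.

```python
# Plain STE baseline on a one-parameter model (illustrative assumptions only;
# the CAGE correction term is NOT implemented here).

def quantize(w, step=0.5):
    """Round w to the nearest multiple of `step` (a hypothetical uniform grid)."""
    return step * round(w / step)

def loss_grad(w_q, xs, ys):
    """Gradient of the mean of 0.5 * (w_q * x - y)**2 with respect to w_q."""
    return sum((w_q * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def ste_train(w, xs, ys, lr=0.1, steps=50, step=0.5):
    """Train a latent full-precision weight w whose forward pass is quantized."""
    for _ in range(steps):
        w_q = quantize(w, step)      # forward pass uses the quantized weight
        g = loss_grad(w_q, xs, ys)   # gradient is evaluated at the quantized weight
        w -= lr * g                  # STE: treat d(w_q)/dw as 1 and update the latent weight
    return w, quantize(w, step)

if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [1.3 * x for x in xs]       # target slope 1.3 lies between grid points 1.0 and 1.5
    latent, quantized = ste_train(0.0, xs, ys)
    print(latent, quantized)
```

In this toy setting the target slope is not on the grid, so the quantized weight can only land on a neighbouring grid point while the latent weight stays between them. Rounding error of this kind is what a correction to the plain STE gradient would aim to account for.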