Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost
Researchers have introduced Quantized Evolution Strategies (QES), an optimization paradigm for fine-tuning quantized large language models (LLMs) directly in their discrete parameter space. Traditional fine-tuning techniques rely on continuous weights and backpropagation, so they do not apply directly to quantized models. QES accumulates error feedback so that weight updates stay precise on the discrete grid, and it uses stateless seed replay to minimize memory usage, which keeps the cost of fine-tuning close to that of low-precision inference. The approach is reported to outperform existing zeroth-order fine-tuning methods, pointing toward scaling LLMs entirely within the quantized domain.
IMPACT Enables more efficient deployment and fine-tuning of large language models on memory-constrained devices.
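To make the two mechanisms named in the summary more concrete, here is a minimal NumPy sketch of an evolution-strategies step on integer weights. It is an illustration under stated assumptions, not the QES algorithm itself: the function names (`perturbation`, `es_step`), the ternary perturbations, the 4-bit range `[-8, 7]`, the learning rate, and the explicitly stored residual are all hypothetical choices made for readability. Perturbations are regenerated from their seeds rather than stored, and the part of each update that rounding discards is carried forward as error feedback.

```python
import numpy as np

def perturbation(seed, shape):
    """Regenerate a perturbation in {-1, 0, +1} from its seed alone (seed replay)."""
    rng = np.random.default_rng(seed)
    return rng.integers(-1, 2, size=shape)

def es_step(weights, residual, loss_fn, seeds, lr=0.5, qmin=-8, qmax=7):
    """One zeroth-order step on integer weights (hypothetical sketch).

    weights  : integer array on the quantized grid
    residual : float array holding the accumulated rounding error
    loss_fn  : maps an integer weight array to a scalar loss (forward pass only)
    seeds    : re-iterable sequence of integer seeds, one per population member
    """
    # Evaluate each perturbed candidate; only the scalar losses are kept.
    losses = np.array([
        loss_fn(np.clip(weights + perturbation(s, weights.shape), qmin, qmax))
        for s in seeds
    ])
    # Lower loss should give a larger positive weight to that direction.
    adv = -(losses - losses.mean()) / (losses.std() + 1e-8)

    # Replay the seeds to rebuild each perturbation instead of storing it.
    update = np.zeros(weights.shape, dtype=np.float64)
    for s, a in zip(seeds, adv):
        update += a * perturbation(s, weights.shape)
    update *= lr / len(seeds)

    # Error feedback: add the carried-over residual, round to the grid,
    # and keep whatever the rounding (or clipping) failed to apply.
    total = residual + update
    step = np.rint(total).astype(weights.dtype)
    new_weights = np.clip(weights + step, qmin, qmax)
    new_residual = total - (new_weights - weights)
    return new_weights, new_residual

if __name__ == "__main__":
    # Toy problem (an assumption for illustration): match an integer target.
    target = np.array([3, -2, 5, 0, -7, 1])
    loss = lambda q: float(np.sum((q - target) ** 2))
    w = np.zeros(6, dtype=np.int64)
    r = np.zeros(6)
    for t in range(50):
        w, r = es_step(w, r, loss, seeds=list(range(t * 8, t * 8 + 8)))
    print(w, loss(w))
```

Without the residual, any update smaller than half a quantization step would round to zero and be lost; carrying it forward lets small updates add up until they cross a grid boundary. The residual is stored explicitly here only for clarity, which costs extra memory that a memory-minimizing method would presumably avoid.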