ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training
Researchers have developed ReQAT, a training framework that lets Large Reasoning Models (LRMs) retain full-precision reasoning accuracy when quantized to 4-bit floating-point formats. Existing quantization methods struggle with low-entropy tokens such as digits and operators, which degrades reasoning. ReQAT addresses this with three components: Trace-Aligned QAT, Selective Entropy Minimization, and Q-FIT initialization, which together focus training on critical decisions and stabilize it. The approach not only recovers the accuracy of standard fine-tuning but surpasses it, while significantly improving inference speed and reducing hardware requirements.
AI IMPACT: Enables more efficient deployment of large reasoning models, potentially reducing hardware costs and increasing inference speed.
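
To make the 4-bit floating-point setting concrete, here is a minimal sketch of generic "fake quantization", the rounding step that quantization-aware training typically simulates in the forward pass. It is not ReQAT's implementation: the E2M1 value grid, the per-tensor scaling rule, and the function name `fake_quantize_fp4` are all assumptions made for illustration, since the summary does not say which 4-bit format or scaling scheme the authors use.

```python
import numpy as np

# Assumption: an E2M1 layout (1 sign, 2 exponent, 1 mantissa bit), whose
# non-negative representable magnitudes are listed below.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_quantize_fp4(x):
    """Round each element of x to the nearest scaled FP4 value (hypothetical helper).

    The result stays in ordinary floating point, which is how QAT usually
    simulates low-precision weights during training.
    """
    x = np.asarray(x, dtype=np.float64)
    max_abs = np.max(np.abs(x))
    if max_abs == 0.0:
        return x.copy()
    # Assumed per-tensor scale: the largest magnitude maps to the top grid value.
    scale = max_abs / FP4_E2M1_GRID[-1]
    scaled = np.abs(x) / scale
    # For every element, pick the index of the closest grid magnitude.
    idx = np.argmin(np.abs(scaled[..., None] - FP4_E2M1_GRID), axis=-1)
    return np.sign(x) * FP4_E2M1_GRID[idx] * scale


weights = np.array([0.02, -0.4, 0.75, 1.2, -3.0])
quantized = fake_quantize_fp4(weights)
print(quantized)
```

With only eight magnitudes per sign, small values collapse to zero and mid-range values move noticeably, which gives a sense of why near-deterministic tokens such as digits and operators can be sensitive to this kind of rounding error.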