On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation
Researchers investigated the impact of low-bit quantization on speaker verification systems and found that performance degradation is not explained by weight distortion alone. They identified 2-bit quantization as a critical point at which score errors and decision flips become significant, particularly for trials whose scores lie near the floating-point decision threshold. To address this, they propose a calibrated multi-precision cascade that scores most trials with 2-bit quantization and escalates only the ambiguous cases, maintaining near-FP32 performance at reduced computational and memory cost.
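The cascade idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `cascade_decision`, the affine calibration (`scale`, `offset`), the fixed `margin` around the threshold, and the toy scores are all assumptions made for illustration. The summary does not specify how calibration or ambiguity detection is actually done.

```python
from typing import Callable, Dict, Tuple


def cascade_decision(
    trial: str,
    score_low: Callable[[str], float],   # hypothetical 2-bit scorer
    score_high: Callable[[str], float],  # hypothetical FP32 scorer
    threshold: float,
    margin: float,
    scale: float = 1.0,    # assumed affine calibration of low-bit scores
    offset: float = 0.0,
) -> Tuple[bool, float, bool]:
    """Return (accept, score_used, escalated) for one verification trial."""
    s_low = scale * score_low(trial) + offset
    # Confident case: the calibrated low-bit score is far from the threshold.
    if abs(s_low - threshold) > margin:
        return (s_low >= threshold, s_low, False)
    # Ambiguous case: re-score with the higher-precision model.
    s_high = score_high(trial)
    return (s_high >= threshold, s_high, True)


# Toy scores (made up) standing in for real model outputs.
fp32_scores: Dict[str, float] = {"a": 0.82, "b": 0.31, "c": 0.52}
q2_scores: Dict[str, float] = {"a": 0.78, "b": 0.35, "c": 0.47}

results = {
    t: cascade_decision(t, q2_scores.__getitem__, fp32_scores.__getitem__,
                        threshold=0.5, margin=0.1)
    for t in fp32_scores
}
```

In this toy run, trials `a` and `b` are decided from the 2-bit score alone, while trial `c` falls within the margin and is escalated. Its 2-bit score (0.47) would have flipped the FP32 decision (0.52, accept), which is the kind of near-threshold error the cascade is meant to catch. In practice the margin would presumably be tuned on held-out trials to trade escalation rate against accuracy.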