Brief · PulseAugur

TOOL · arXiv cs.CL English(EN) · 4d

Quantizing Whisper-small: How design choices affect ASR performance

A new arXiv study evaluates post-training quantization (PTQ) techniques for the Whisper-small automatic speech recognition (ASR) model. The study tested implementations from PyTorch, Optimum-Quanto, HQQ, and bitsandbytes, and found that dynamic int8 quantization with Quanto gave the best balance of compression and accuracy. This method reduced model size by 57% while slightly lowering (that is, improving) word error rate on the LibriSpeech dataset, which makes Whisper-small easier to deploy on resource-constrained devices.
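To make "dynamic int8 quantization" concrete, here is a minimal from-scratch sketch in plain Python. It is not the Quanto API and not the study's code. It assumes symmetric per-tensor scaling, a toy 8x64 weight matrix, and random inputs, all chosen only for illustration. Weights are converted to int8 once, while the activation scale is computed at call time, which is what "dynamic" refers to.

```python
import random
import struct


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dynamic_int8_linear(x, w_codes, w_scale, n_out, n_in):
    """Compute y = W x using int8 weights.

    The activation scale is derived from x on every call (dynamic),
    whereas the weight scale was fixed when the weights were quantized.
    """
    x_codes, x_scale = quantize_int8(x)
    y = []
    for i in range(n_out):
        row = w_codes[i * n_in:(i + 1) * n_in]
        acc = sum(wc * xc for wc, xc in zip(row, x_codes))  # integer accumulate
        y.append(acc * w_scale * x_scale)  # rescale back to floating point
    return y


# Toy layer (hypothetical sizes, not Whisper-small's real dimensions).
random.seed(0)
n_out, n_in = 8, 64
weights = [random.gauss(0.0, 0.05) for _ in range(n_out * n_in)]
x = [random.gauss(0.0, 1.0) for _ in range(n_in)]

# Quantize the weights once, ahead of time.
w_codes, w_scale = quantize_int8(weights)

# Full-precision reference output versus the quantized path.
y_ref = [
    sum(weights[i * n_in + j] * x[j] for j in range(n_in))
    for i in range(n_out)
]
y_q = dynamic_int8_linear(x, w_codes, w_scale, n_out, n_in)
max_err = max(abs(a - b) for a, b in zip(y_ref, y_q))

# Storage cost: 4 bytes per fp32 weight versus 1 byte per int8 code
# plus a single fp32 scale for the whole tensor.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(w_codes)}b", *w_codes)) + 4

print("fp32 bytes:", fp32_bytes)
print("int8 bytes:", int8_bytes)
print("max abs output error:", max_err)
```

The size ratio of this single toy tensor is not comparable to the 57% figure reported for the whole model, which depends on which layers a given library quantizes and how it stores scales.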

IMPACT Enables more efficient deployment of speech recognition models on edge devices by reducing size and computational cost.

arXiv
PyTorch
bitsandbytes
LibriSpeech
HQQ
Andreas Kirkedal
Optimum-Quanto
Whisper-small