Quantization study enables smaller, more accurate Whisper-small ASR

By PulseAugur Editorial · [1 source] · 2026-05-22 04:00

A new study published on arXiv evaluates post-training quantization (PTQ) techniques for the Whisper-small automatic speech recognition model. The research, which tested libraries including PyTorch, Optimum-Quanto, HQQ, and bitsandbytes, found that dynamic int8 quantization with Quanto gave the best balance of compression and accuracy. This method reduced model size by 57% while slightly lowering word error rate on the LibriSpeech dataset, making Whisper-small easier to deploy on resource-constrained devices.
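To make the idea of int8 quantization concrete, here is a minimal pure-Python sketch of symmetric per-tensor int8 rounding. It is an illustration only, not the Quanto API and not the study's method: the function names, the per-tensor scale, and the sample weights are all assumptions chosen for the example.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization (illustrative, hypothetical helper).

    Maps floats to integer codes in [-127, 127] using one shared scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Made-up sample weights, not taken from Whisper-small.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage: 4 bytes per float32 value versus 1 byte per int8 code,
# plus one float32 scale for the whole tensor.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights) + 4

print(codes, scale, max_error, fp32_bytes, int8_bytes)
```

Because an int8 code takes one byte where a float32 value takes four, a fully quantized tensor shrinks by at most about 75%. The 57% figure reported for the whole model sits below that ceiling; the summary does not say why.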

IMPACT Enables more efficient deployment of speech recognition models on edge devices by reducing size and computational cost.

RANK_REASON The cluster contains an academic paper detailing research into model optimization techniques.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English (EN) · Arthur Söhler, Julian Irigoyen, Andreas Søeborg Kirkedal · 2026-05-22 04:00

Quantizing Whisper-small: How design choices affect ASR performance

arXiv:2511.08093v2 (announce type: replace-cross). Abstract: Large speech recognition models like Whisper-small achieve high accuracy but are difficult to deploy on edge devices due to their high computational demand. To this end, we present a unified, cross-library evaluation of po…

