BaldWhisper model achieves 48% size reduction and 2.15x speedup

By the PulseAugur editorial team · [1 source] · 2026-05-05 04:00

Researchers have developed BaldWhisper, a method for compressing and accelerating the Whisper speech-to-text model. By applying low-rank decomposition to the embeddings and merging transformer layers, BaldWhisper reduces model size by 48% and runs 2.15x faster on a MacBook Air M1. The compressed model retains 90% of the original model's performance, even in a data-scarce setting: Bambara, with only 32 hours of training data.
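To make the embedding step concrete, here is a minimal sketch of why low-rank factorization shrinks an embedding table: a dense table of shape (vocab_size, d_model) is replaced by two smaller matrices of shapes (vocab_size, rank) and (rank, d_model). The function names and the sizes below are illustrative assumptions, not values from the paper, and the sketch only counts parameters; it does not reproduce the authors' decomposition or training procedure.

```python
def embedding_params(vocab_size: int, d_model: int) -> int:
    """Parameters in a dense embedding table of shape (vocab_size, d_model)."""
    return vocab_size * d_model


def low_rank_params(vocab_size: int, d_model: int, rank: int) -> int:
    """Parameters after factoring into (vocab_size, rank) @ (rank, d_model)."""
    return rank * (vocab_size + d_model)


def reduction(vocab_size: int, d_model: int, rank: int) -> float:
    """Fraction of embedding parameters removed by the factorization."""
    full = embedding_params(vocab_size, d_model)
    return 1.0 - low_rank_params(vocab_size, d_model, rank) / full


# Illustrative sizes only (assumed, not taken from the paper).
VOCAB, D_MODEL, RANK = 50_000, 512, 128

print(embedding_params(VOCAB, D_MODEL))           # 25600000
print(low_rank_params(VOCAB, D_MODEL, RANK))      # 6465536
print(round(reduction(VOCAB, D_MODEL, RANK), 3))  # 0.747
```

The factorization only saves parameters when rank is smaller than vocab_size * d_model / (vocab_size + d_model), so the choice of rank trades size against how faithfully the original embeddings can be reconstructed.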

Impact: Offers a path to deploying capable speech-to-text models on edge devices when training data is limited.

Ranking rationale: This is a research paper detailing a new method for model compression and acceleration.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL · English (EN) · Yaya Sy, Christophe Cerisara, Irina Illina · 2026-05-05 04:00

BaldWhisper: Faster Whisper with Head Shearing and Layer Merging

arXiv:2510.08599v2 · Abstract: Pruning large pre-trained transformers in a data-scarce scenario is challenging, as it often requires massive retraining data to recover performance. For instance, Distill-Whisper prunes Whisper by 40 and retrains on 21,00…

