Whisper Large V3
PulseAugur coverage of Whisper Large V3 — every cluster mentioning Whisper Large V3 across labs, papers, and developer communities, ranked by signal.
-
LibriConvo corpus advances ASR and speaker diarization
Researchers have developed LibriConvo, a new synthetic conversational speech corpus designed to improve automatic speech recognition (ASR) and speaker diarization systems. The corpus was created by adapting the Speaker-…
-
Speech models compressed using parameter clustering
Researchers have developed a new method for compressing speech foundation models without requiring additional data or retraining. This approach utilizes channelwise clustering with k-means to achieve parameter compressi…
-
Whisfusion uses masked diffusion for faster, more accurate speech recognition
Researchers have developed Whisfusion, a novel non-autoregressive system for automatic speech recognition (ASR) that utilizes masked diffusion models. This approach aims to match the accuracy of traditional autoregressi…
-
ASR models advance with new architectures and vast supervised data
The field of Automatic Speech Recognition (ASR) is seeing rapid advancements driven by two primary factors: the increasing availability of pseudo-labeled data and the emergence of new model architectures. While models l…
-
Together AI builds world's fastest speech-to-text stack
Together AI has developed a highly efficient speech-to-text system, significantly outperforming existing models in speed. Their approach addresses the unique challenges of audio data processing, which is substantially l…
-
New benchmark PashtoTTS-Bench evaluates low-resource text-to-speech systems
A new benchmark, PashtoTTS-Bench, has been developed to evaluate text-to-speech systems for low-resource languages like Pashto, addressing limitations of traditional round-trip ASR methods. The benchmark introduces the …
-
Voice AI Stack Matures: Top STT, TTS, and Orchestration Platforms for Production
A May 2026 analysis of voice AI technologies reveals significant advancements across Speech-to-Text (STT), Text-to-Speech (TTS), and orchestration platforms, making voice agents a viable engineering problem for producti…
-
AI flywheel boosts Indic ASR accuracy by 17x for niche entities
Researchers have developed a novel Text-to-Speech (TTS) and Speech-to-Text (STT) system, dubbed the "TTS-STT Flywheel," to improve Automatic Speech Recognition (ASR) for niche domains in Indic languages. This system syn…
-
Moonshine Voice releases open-source STT toolkit with on-device processing
Moonshine Voice has released an open-source AI toolkit designed for developers building real-time voice applications. The framework and its speech-to-text models are optimized for low latency and run entirely on-device,…