Automatic Speech Recognition
PulseAugur coverage of Automatic Speech Recognition — every cluster mentioning Automatic Speech Recognition across labs, papers, and developer communities, ranked by signal.
-
ASR fine-tuned for Indian banking calls after 3-week effort
This article details the process of fine-tuning an Automatic Speech Recognition (ASR) system specifically for the unique challenges of Indian banking calls. The author spent three weeks experimenting with multiple model…
-
LLMs generate synthetic conversations to boost ASR training
Researchers have developed a novel method to enhance Automatic Speech Recognition (ASR) training for low-resource languages by generating synthetic conversational data. This pipeline uses LLMs to create dialogues, maps …
-
New ASR methods tackle compute scaling and multilingual evaluation
Researchers are developing new methods to improve automatic speech recognition (ASR) systems. One approach, LARM, uses a depth-conditioned looped Transformer to allow for adjustable test-time computation, achieving perf…
-
Noisekit CLI generates realistic degraded audio for ASR benchmarking
A new command-line tool called noisekit has been released to help benchmark automatic speech recognition (ASR) systems. It generates realistic degraded audio datasets by applying various noise and distortion conditions …
-
Intel NPU accelerates smart home ASR, outperforming CPU on speed and energy
A user ran Automatic Speech Recognition (ASR) on an Intel Arrow Lake NPU in a smart home setup and reports significant performance gains. The NPU processed a 10-second audio clip 4.8 times faste…
-
AI voice assistants in 2026 offer advanced capabilities for personal and business use
AI voice assistants in 2026 are significantly more advanced, leveraging LLMs, ASR, ML, and NLP to understand natural speech, learn continuously, and personalize responses. These assistants are categorized into personal …
-
New neural layer nASR enhances EEG artifact removal for BCIs
Researchers have developed nASR, a novel trainable neural layer designed to improve electroencephalogram (EEG) signal processing for brain-computer interfaces (BCIs). This new layer addresses limitations in existing Art…
-
Voice AI paradox: Advanced chat, basic failures
Voice AI assistants like Yandex's Alisa pair advanced conversational abilities with basic functional failures, a paradox that stems from their complex architecture. This hybrid system combines speech recognitio…
-
Sakana AI's KAME architecture injects LLM knowledge into speech AI without latency
Sakana AI has developed KAME, a novel tandem architecture for speech-to-speech AI that aims to combine the speed of direct systems with the knowledge depth of LLM-based approaches. KAME operates with two asynchronous co…
-
Tamazight single-speaker speech dataset released on Hugging Face
A new single-speaker speech dataset for the Tamazight language has been released on Hugging Face and the Mozilla Data Collective. This dataset is intended for use in AI applications such as automatic speech recognition …
-
Researchers enhance elderly ASR with LLM paraphrasing and speech synthesis
Researchers have developed a novel data augmentation technique to improve automatic speech recognition (ASR) for elderly individuals. This method utilizes large language models to paraphrase existing transcripts, genera…
-
New LLMs unify audio and language processing for full-duplex and medical applications
Researchers have developed UAF, a novel unified audio front-end LLM designed for full-duplex speech interaction. This model integrates diverse audio front-end tasks like voice activity detection and turn-taking into a s…
-
MedSpeak framework improves medical QA by correcting ASR errors with knowledge graphs
Researchers have developed MedSpeak, a new framework designed to improve the accuracy of spoken question-answering systems in the medical domain. This system utilizes a medical knowledge graph to aid automatic speech re…
-
New framework identifies demographic unfairness in speech recognition models
A new research paper identifies two types of errors—random variance and systematic bias—that contribute to demographic unfairness in speech recognition models. The study found that while both error types are present, ra…
-
"This Wasn't Made for Me": ASR Bias Hurts Users Emotionally and Cognitively
A new research paper highlights the emotional and psychological toll of bias in Automatic Speech Recognition (ASR) systems. The study, which involved user experience research in four U.S. locations, found that participa…