ENTITY Automatic Speech Recognition

PulseAugur coverage of Automatic Speech Recognition: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d

25 over 90d

Releases · 30d

0 over 90d

Papers · 30d

18 over 90d

TIER MIX · 90D

significant 1
research 10
tool 12
commentary 2

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/2 · 25 TOTAL

TOOL · CL_132412 · Jul 8 · 15:01

Voice AI platforms offer businesses a solution to customer service challenges

Voice AI platforms are emerging as a practical solution for businesses struggling with customer service wait times and high contact center turnover. These platforms utilize Automatic Speech Recognition (ASR), Natural La…

RESEARCH · CL_117645 · Jun 30 · 04:00

New research tackles LLM alignment, safety, and optimization challenges

Researchers are exploring new methods to improve the alignment and reliability of large language models (LLMs). One study identifies a vulnerability in byte-pair encoding (BPE) tokenization that can be exploited to bypa…

RESEARCH · CL_117603 · Jun 30 · 04:00

New research advances ASR for dysarthric speech and synthetic data use · 4 sources tracked

Researchers are exploring new methods to improve automatic speech recognition (ASR) systems. One study details how fine-tuning the Whisper model with personalized data significantly reduced word error rates for dysarthr…

TOOL · CL_100168 · Jun 19 · 04:00

Korean spoken QA research highlights ASR error impact on LLMs

A new research paper analyzes how errors in Korean speech recognition impact the performance of large language models (LLMs) in spoken question answering (SQA). The study found that the degradation caused by speech reco…

RESEARCH · CL_86652 · Jun 11 · 17:41

Speech representations impact 3D facial animation quality

Researchers have explored how different speech representations impact the quality of 3D facial animation. The study compared four families of speech representations, evaluating their effectiveness with two facial decode…

TOOL · CL_81713 · Jun 9 · 21:34

Open-source tools and ASR benchmarks advance local AI capabilities

This week's AI news highlights advancements in Automatic Speech Recognition (ASR) for bilingual voice agents and introduces two key open-source computer vision tools. The ASR focus is on benchmarking frontier models for…

TOOL · CL_78238 · Jun 8 · 14:15

ASR fine-tuned for Indian banking calls after 3-week effort

This article details the process of fine-tuning an Automatic Speech Recognition (ASR) system specifically for the unique challenges of Indian banking calls. The author spent three weeks experimenting with multiple model…

RESEARCH · CL_68139 · Jun 2 · 17:46

LLMs generate synthetic conversations to boost ASR training

Researchers have developed a novel method to enhance Automatic Speech Recognition (ASR) training for low-resource languages by generating synthetic conversational data. This pipeline uses LLMs to create dialogues, maps …

RESEARCH · CL_65569 · Jun 1 · 17:49

New ASR methods tackle compute scaling and multilingual evaluation

Researchers are developing new methods to improve automatic speech recognition (ASR) systems. One approach, LARM, uses a depth-conditioned looped Transformer to allow for adjustable test-time computation, achieving perf…

TOOL · CL_58633 · May 29 · 04:00

New Agentic ASR framework mimics human interaction for speech recognition

Researchers have introduced "Agentic ASR," a novel framework designed to improve automatic speech recognition (ASR) by mimicking human-like interactive correction. Unlike traditional single-pass systems, Agentic ASR ope…

TOOL · CL_56296 · May 28 · 04:00

New TARQ technique boosts ASR accuracy for rare words

Researchers have developed a new post-training quantization technique called TARQ, designed to improve the accuracy of Automatic Speech Recognition (ASR) systems, particularly for rare words. TARQ addresses a limitation…

SIGNIFICANT · CL_55755 · May 28 · 03:04

Alibaba AI voice model ranks 5th globally, leads China on Speech Arena

Alibaba's new AI voice model, Fun-Realtime-TTS-Preview, placed fifth worldwide and first in China on the Speech Arena benchmark. The model demonstrated strong performance…

TOOL · CL_54716 · May 27 · 13:06

Noisekit CLI generates realistic degraded audio for ASR benchmarking

A new command-line tool called noisekit has been released to help benchmark automatic speech recognition (ASR) systems. It generates realistic degraded audio datasets by applying various noise and distortion conditions …

TOOL · CL_51864 · May 26 · 07:23

Intel NPU accelerates smart home ASR, outperforming CPU on speed and energy

A user has successfully utilized their Intel Arrow Lake NPU for Automatic Speech Recognition (ASR) in a smart home setup, achieving significant performance gains. The NPU processed a 10-second audio clip 4.8 times faste…

RESEARCH · CL_51265 · May 25 · 03:57

New method enhances spoken dialogue systems by diagnosing ASR-LLM errors

Researchers have developed a novel approach to improve spoken dialogue systems by addressing error propagation in cascaded Automatic Speech Recognition (ASR) and Large Language Model (LLM) pipelines. This new method use…

COMMENTARY · CL_47605 · May 25 · 03:00

AI voice assistants in 2026 offer advanced capabilities for personal and business use

AI voice assistants in 2026 are significantly more advanced, leveraging LLMs, ASR, ML, and NLP to understand natural speech, learn continuously, and personalize responses. These assistants are categorized into personal …

TOOL · CL_32731 · May 14 · 15:15

New neural layer nASR enhances EEG artifact removal for BCIs

Researchers have developed nASR, a novel trainable neural layer designed to improve Electroencephalogram (EEG) signal processing for Brain-Computer Interfaces (BCIs). This new layer addresses limitations in existing Art…

COMMENTARY · CL_23142 · May 8 · 14:27

Voice AI paradox: Advanced chat, basic failures

Voice AI assistants like Yandex's Alisa exhibit a paradox of advanced conversational abilities alongside basic functional failures, stemming from their complex architecture. This hybrid system combines speech recognitio…

RESEARCH · CL_13577 · May 3 · 07:47

Sakana AI's KAME architecture injects LLM knowledge into speech AI without latency

Sakana AI has developed KAME, a novel tandem architecture for speech-to-speech AI that aims to combine the speed of direct systems with the knowledge depth of LLM-based approaches. KAME operates with two asynchronous co…

RESEARCH · CL_09296 · Apr 29 · 16:36

Tamazight single-speaker speech dataset released on Hugging Face

A new single-speaker speech dataset for the Tamazight language has been released on Hugging Face and the Mozilla Data Collective. This dataset is intended for use in AI applications such as automatic speech recognition …
