HuBERT
PulseAugur coverage of HuBERT: every cluster mentioning HuBERT across labs, papers, and developer communities, ranked by signal.
-
Speech models encode child age/gender in early layers, study finds
Researchers have analyzed how well self-supervised learning (SSL) models capture age and gender information in children's speech. The study focused on four models: Wav2Vec2, HuBERT, Data2Vec, and WavLM, examining their …
-
Transformer models show improved accuracy for Quranic ASR
Researchers have conducted a comparative study on pretrained Transformer models for Quranic Automatic Speech Recognition (ASR), aiming to reduce high Word Error Rates (WER) on user-recited verses. The study fine-tuned m…
-
New LM-SPT method enhances speech tokenization for better language model alignment
Researchers have developed LM-SPT, a novel method for speech tokenization that aims to improve the alignment between speech and language models. Unlike previous approaches that directly distill features or use pooling, …
-
New dataset enhances AI detection of deepfake audio with linguistic cues
Researchers have introduced Linguistically Augmented Audio Speech Data (LinguAS), a new dataset designed to combat the rise of deepfaked audio. LinguAS includes over 800 audio samples, both genuine and fake, annotated w…
-
Speech models generalize to recognize rare click consonants
Researchers investigated whether self-supervised speech models can accurately recognize uncommon speech sounds, specifically click consonants found in Khoisan languages. By fine-tuning models like Wav2Vec2 and HuBERT on…
-
AI model detects Parkinson's disease using multi-modal speech analysis
Researchers have developed a novel multi-branch deep learning framework designed to improve the detection of Parkinson's disease through speech analysis. This approach utilizes three distinct speech representations: Log…
-
GeMCL algorithm scales few-shot spoken word classification
Researchers have developed a new method called Generative Meta-Continual Learning (GeMCL) to improve few-shot spoken word classification. This approach allows a model to sequentially learn to distinguish between 1000 cl…
-
New NLP models tackle dementia detection in Filipino speech
Researchers have developed a new approach to dementia detection using natural language processing, focusing on low-resource languages like Filipino. They created a bilingual dataset and evaluated several transformer mod…
-
Generative meta-learning shows minimal language impact on spoken word classification
Researchers have explored the effectiveness of generative meta-continual learning for spoken word classification across multiple languages. Their findings indicate that while multilingual models perform best, the perfor…
-
New framework improves speech confidence detection using Whisper
Researchers have developed a new semi-supervised framework for detecting speaker confidence in speech, addressing the challenge of limited labeled data. This approach combines deep semantic embeddings from OpenAI's Whis…
-
New framework analyzes concept representations in neural models
Researchers have developed a new framework to analyze how neural models represent human-interpretable concepts. This framework uses axes of containment and disentanglement to study concept subspaces within models. Exper…
-
AI models trained on birdsong classify elephant calls with high accuracy
Researchers have demonstrated that pre-trained acoustic embeddings can effectively classify elephant vocalizations without requiring fine-tuning. This approach is particularly valuable given the scarcity and cost of ann…
-
Speech-FT framework merges pre-trained and fine-tuned models for better generalization
Researchers have developed Speech-FT, a novel two-stage fine-tuning framework designed to improve speech representation models. This method aims to enhance performance on specific tasks without sacrificing the model's a…
-
New AI method stably characterizes dysarthria across languages and causes
Researchers have developed a novel, training-free method to assess dysarthria severity using self-supervised speech representations. This approach analyzes phonological feature subspaces across 3,374 speakers in 12 lang…