
WavLM

PulseAugur coverage of WavLM — every cluster mentioning WavLM across labs, papers, and developer communities, ranked by signal.

Total · 30d: 13 (13 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 13 (13 over 90d)
Sentiment · 30d: 4 days with sentiment data

RECENT · PAGE 1/1 · 13 TOTAL
  1. TOOL · CL_104735

    Speech models encode child age/gender in early layers, study finds

    Researchers have analyzed how well self-supervised learning (SSL) models capture age and gender information in children's speech. The study focused on four models: Wav2Vec2, HuBERT, Data2Vec, and WavLM, examining their …

  2. RESEARCH · CL_96194

    New toolkits simplify syllable-level speech tokenization for AI models

    Two new research papers introduce toolkits for syllable-level speech tokenization, aiming to improve spoken language modeling. The first, "findsylls," offers a language-agnostic toolkit that unifies various syllab…

  3. TOOL · CL_93486

    WavSLM simplifies speech generation with distilled WavLM representations

    Researchers have developed WavSLM, a novel speech language model that simplifies the generation of coherent speech by distilling self-supervised WavLM representations into a single codebook. This approach allows WavSLM …

  4. TOOL · CL_93454

    New discrete optimal transport attack targets speaker verification systems

    Researchers have developed a novel adversarial attack method using discrete optimal transport (DOT) that targets automatic speaker verification (ASV) and anti-spoofing systems. This black-box attack operates by aligning…

  5. COMMENTARY · CL_81442

    ASR models advance with new architectures and vast supervised data

    The field of Automatic Speech Recognition (ASR) is seeing rapid advancements driven by two primary factors: the increasing availability of pseudo-labeled data and the emergence of new model architectures. While models l…

  6. RESEARCH · CL_82022

    New method explains deepfake speech detector decisions

    Researchers have developed a new method to understand how deepfake speech detectors make their decisions. By using Integrated Gradients on self-supervised representations, the technique can pinpoint specific moments in …

  7. TOOL · CL_80093

    New voice conversion method uses KNN for non-parallel data

    Researchers have developed a novel voice conversion framework that uses K-Nearest Neighbors (KNN) retrieval on WavLM representations to align non-parallel speech data. This method constructs synthetic training pairs fro…

  8. TOOL · CL_29444

    New framework improves speech confidence detection using Whisper

    Researchers have developed a new semi-supervised framework for detecting speaker confidence in speech, addressing the challenge of limited labeled data. This approach combines deep semantic embeddings from OpenAI's Whis…

  9. RESEARCH · CL_22202

    WavCube model unifies speech understanding and generation with compressed representation

    Researchers have developed WavCube, a novel speech representation model designed to unify speech understanding and generation tasks. This model utilizes a compact continuous latent space derived from a self-supervised l…

  10. TOOL · CL_18816

    Phoneme-level analysis improves detection of emotionally manipulated synthetic speech

    Researchers have developed a new method for detecting deepfake audio by analyzing speech at the phoneme level. This approach, which uses self-supervised embeddings, proved more effective than previous methods that treat…

  11. RESEARCH · CL_15484

    Researchers explore quantum and deep learning for audio deepfake detection

    Two research papers submitted to the Environment-Aware Speech and Sound Deepfake Detection Challenge (ESDD2) in 2026 propose novel deep-learning frameworks for detecting manipulated audio. The first paper introduces a d…

  12. RESEARCH · CL_16198

    New GRIDS framework detects anomalies in self-supervised speech models

    Researchers have developed a new framework called GRIDS to analyze how perturbations affect the internal representations of self-supervised speech models. By using Local Intrinsic Dimensionality (LID), the framework can…

  13. RESEARCH · CL_14111

    LASE model improves cross-script voice cloning by making embeddings language-uninformative

    Researchers have developed LASE, a Language-Adversarial Speaker Encoder, to improve multilingual voice cloning. Standard encoders struggle to maintain speaker identity across different scripts, particularly when project…