Audio Language Models
PulseAugur coverage of Audio Language Models: every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.
-
New framework enhances in-context learning for clinical audio diagnosis
Researchers have developed a new framework called Federated Self-Contextualization (FSC) designed to improve in-context learning for audio-language models in clinical settings, particularly in low-resource environments.…
-
New framework enhances audio language models with trainable audio prompts
Researchers have developed a new framework for Audio Language Models (ALMs) that introduces trainable prompts directly into the audio encoder. This approach aims to capture task-specific acoustic features, enhancing few…
-
Audio language models improve speech emotion recognition with acoustic cues
Researchers have developed a method to improve speech emotion recognition in audio language models by incorporating explicit acoustic cues. By deriving six interpretable acoustic concept tokens from paralinguistic featu…
-
Audio-language models override clear audio with conflicting text
Researchers have identified a significant issue in audio-language models where conflicting text inputs override clear audio evidence, leading to incorrect outputs. A new study reveals that in 64.1% of conflict cases acr…
-
New tools enhance audio deepfake detection and analysis
Researchers have developed new tools and methods to combat audio deepfakes. AUDDT is an open-source toolkit designed to evaluate the generalization capabilities of deepfake detectors across a wide array of audio dataset…
-
New PitchBench benchmark reveals unreliable pitch hearing in audio-language models
Researchers have developed PitchBench, a new evaluation suite designed to systematically measure the pitch perception abilities of audio-language models (ALMs). The suite includes 28 experiments that test both absolute …
-
Researchers warn AI voice assistants vulnerable to hidden audio commands
Researchers have identified a significant security vulnerability in AI voice assistants and audio-language models. These systems, increasingly used as everyday interfaces, can be manipulated through imperceptible audio …
-
New architecture boosts audio language models' attention to salient sounds
Researchers have developed NAACA, a novel architecture designed to improve how audio language models process long audio recordings. NAACA uses a training-free approach with an Oscillatory Working Memory (OWM) to filter …
-
New AI method automates coding of therapy sessions
Researchers have developed a new method for automatically coding Motivational Interviewing (MI) sessions using audio-language models (ALMs). This approach analyzes both spoken words and acoustic cues, integrating predic…