Large Audio-Language Models

PulseAugur coverage of Large Audio-Language Models (LALMs): every cluster mentioning them across labs, papers, and developer communities, ranked by signal.

Total · 30d · 17 over 90d

Releases · 30d · 0 over 90d

Papers · 30d · 16 over 90d

SENTIMENT · 30D

10 day(s) with sentiment data

RECENT · PAGE 1/1 · 17 TOTAL

TOOL · CL_119468 · Jul 1 · 04:00

ALM2Vec framework uses large audio-language models for universal audio retrieval

Researchers have introduced ALM2Vec, a novel framework designed to create universal audio embeddings by leveraging large audio-language models (LALMs). Unlike previous methods focused on audio-caption matching, ALM2Vec …
TOOL · CL_109873 · Jun 24 · 04:42

New benchmark evaluates audio LLMs' context-aware scene understanding

Researchers have introduced a new benchmark called CASU (Context-Aware Auditory Scene Understanding) to evaluate Large Audio Language Models (LALMs). Existing benchmarks often assess audio layers like speech or sound in…
TOOL · CL_113484 · Jun 23 · 14:43

New benchmark reveals LALM judges lag human paralinguistic evaluation

Researchers have developed ParaPairAudioBench, a new benchmark designed to evaluate Large Audio-Language Models (LALMs) on their ability to distinguish subtle paralinguistic features in speech. The benchmark includes 5,…
TOOL · CL_98002 · Jun 18 · 04:00

New CoAT Framework Enhances Large Audio Language Models with Continuous Thinking Space

Researchers have developed a new framework called Continuous Audio Thinking (CoAT) designed to enhance the capabilities of Large Audio Language Models (LALMs). CoAT equips these models with a continuous latent workspace…
RESEARCH · CL_96198 · Jun 17 · 04:00

New benchmarks tackle privacy risks in large language models

Researchers have developed new methods to evaluate membership inference attacks (MIAs) against large language models (LLMs), particularly focusing on audio and text modalities. The first study introduces a systematic ev…
RESEARCH · CL_90823 · Jun 12 · 16:09

New AudioDER Dataset Boosts LALM Reasoning Capabilities

Researchers have introduced AudioDER, a new dataset designed to enhance the reasoning capabilities of Large Audio-Language Models (LALMs). The dataset addresses the issue of redundancy in existing audio-language dataset…
TOOL · CL_77280 · Jun 8 · 04:00

SpectCount uses synthetic audio to boost large audio language models

Researchers have developed SpectCount, a novel method for improving large audio language models (LALMs) by using synthetic audio signals. This approach addresses the scarcity of high-quality annotated audio data by gene…
RESEARCH · CL_79160 · Jun 7 · 11:07

New adapter adds test-time memory to audio LLMs for better emotion recognition

Researchers have developed a novel method called Titans-as-a-Layer (MAL) to enhance conversational speech emotion recognition. This plug-and-play adapter integrates test-time neural memory into large audio language mode…
RESEARCH · CL_79146 · Jun 6 · 14:24

New GlobeAudio benchmark tests AI audio models on naturalistic language

Researchers have introduced GlobeAudio, a new benchmark designed to evaluate Large Audio-Language Models (LALMs) in more realistic, naturalistic settings. The benchmark features 5,637 multiple-choice questions in six di…
RESEARCH · CL_70168 · Jun 3 · 00:00

New Audio Interaction Model Unifies Real-Time Audio Tasks

Researchers have introduced the Audio Interaction Model (AIM), a novel Large Audio Language Model (LALM) designed for real-time, interactive audio processing. Unlike previous offline or single-task streaming models, AIM…
TOOL · CL_58813 · May 29 · 04:00

EvA Architecture Enhances Audio Understanding in Large Language Models

Researchers have introduced EvA (Evidence-First Audio), a novel dual-path architecture designed to improve the performance of Large Audio Language Models (LALMs). EvA addresses the 'evidence bottleneck' by enhancing the…
TOOL · CL_58710 · May 29 · 04:00

New benchmark and method improve temporal grounding in music LLMs

Researchers have introduced MusTBENCH, a new benchmark designed to evaluate the temporal grounding capabilities of Large Audio-Language Models (LALMs) in music understanding. Existing LALMs often struggle to accurately …
RESEARCH · CL_58559 · May 28 · 14:53

New research reveals escalating LLM and LALM jailbreak vulnerabilities

Three new research papers explore the vulnerabilities and defenses of large language models (LLMs) and large audio-language models (LALMs). The first paper details a taxonomy of audio jailbreak attacks and defenses, hig…
TOOL · CL_56372 · May 28 · 04:00

New Protocol Assesses Factual Music Comprehension in Audio LLMs

Researchers have developed a new protocol to accurately assess the factual music comprehension of large audio language models (LALMs). The existing MusicQA dataset was found to be insufficient for measuring the factual …
RESEARCH · CL_36822 · May 17 · 13:00

Hidden audio attacks compromise AI voice systems

New research reveals that AI voice systems, including large audio-language models (LALMs), are susceptible to hidden audio attacks. These attacks embed imperceptible sounds into audio clips, allowing malicious actors to…
RESEARCH · CL_06671 · Apr 28 · 04:00

HeadRouter prunes audio tokens in LLMs by routing attention heads

Researchers have introduced HeadRouter, a novel method for compressing large audio language models by dynamically pruning audio tokens. Unlike previous approaches that assume uniform head importance, HeadRouter recogniz…
RESEARCH · CL_06271 · Apr 27 · 12:25

Audio-language models often answer questions without audio, challenging evaluation methods

New research indicates that Large Audio-Language Models (LALMs) may not possess true auditory perception despite high benchmark scores. Studies reveal that these models can answer questions using only text and general k…
