NAACA: Training-Free NeuroAuditory Attentive Cognitive Architecture with Oscillatory Working Memory for Salience-Driven Attention Gating

New architecture improves audio language models' attention to salient sounds

By the PulseAugur editorial team · [1 source] · 2026-05-13 15:09

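Researchers have developed NAACA, a new architecture designed to improve how audio language models handle long audio recordings. NAACA is training-free and uses an oscillatory working memory (OWM) to gate on salient auditory events, reducing unnecessary processing. The approach improves performance on tasks such as violence detection: on the XD-Violence dataset, average precision rises from 53.50% to 70.60%, a gain of 17.10 percentage points.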
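To make the idea of salience gating concrete, here is a minimal sketch in Python. It is not NAACA's method: the summary does not describe how the oscillatory working memory works internally, so this example substitutes a plain running z-score over per-segment energy. The function name `salient_segments`, its parameters, and the sample energies are all hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def salient_segments(energies, window=5, z_threshold=2.0):
    """Return indices of segments that stand out from the recent background.

    Assumption: salience is approximated by a running z-score of segment
    energy against the previous `window` segments. This is a generic
    stand-in, not the oscillatory working memory described in the paper.
    """
    selected = []
    for i, energy in enumerate(energies):
        if i < window:
            continue  # wait for a full window of background context
        history = energies[i - window:i]
        mu = mean(history)
        sigma = pstdev(history) or 1e-9  # guard against a perfectly flat background
        if abs(energy - mu) / sigma > z_threshold:
            selected.append(i)
    return selected


# Hypothetical per-segment energies: steady background with one loud event.
energies = [1.0, 1.1, 0.9, 1.0, 1.05, 6.0, 1.0, 0.95, 1.1, 1.0]
print(salient_segments(energies))  # -> [5]
```

In a pipeline of this kind, only the selected segments would be forwarded to the audio language model, which is where the saving in unnecessary processing would come from.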

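Impact: By focusing attention on key sounds, the approach strengthens audio processing in language models and could improve applications such as surveillance and environmental monitoring.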

Ranking rationale: An academic paper was published detailing a new AI architecture and its performance on a specific dataset.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Dick Botteldooren · 2026-05-13 15:09

NAACA: Training-Free NeuroAuditory Attentive Cognitive Architecture with Oscillatory Working Memory for Salience-Driven Attention Gating

Audio provides critical situational cues, yet current Audio Language Models (ALMs) face an attention bottleneck in long-form recordings where dominant background patterns can dilute rare, salient events. We introduce NAACA, a training-free NeuroAuditory Attentive Cognitive Archit…
