EmoMM: Benchmarking and Steering MLLM for Multimodal Emotion Recognition under Conflict and Missingness

New research tackles conflicting data in multimodal emotion recognition

By the PulseAugur editorial team · [3 sources] · 2026-05-05 04:00

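Researchers have developed new methods to improve multimodal emotion recognition, which combines text, audio, and visual data. One method, Dual-Path Conflict Resolution (DCR), learns either to fuse conflicting modalities or to drop them entirely, and outperforms existing baselines on several benchmarks. The other work, EmoMM, introduces a benchmark together with a technique called conflict-aware head attention steering (CHASE) that addresses problems such as video contribution collapse in multimodal large language models, making them more reliable in complex emotional scenarios.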

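Impact: Advances in multimodal emotion recognition could lead to AI that understands human interactions and emotions in complex real-world scenarios with more nuance.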

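Ranking rationale: Two research papers introduce novel methods and a benchmark for multimodal emotion recognition, addressing challenges such as modality conflict and missing data.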


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

arXiv cs.LG TIER_1 English(EN) · Yangchen Yu, Qian Chen, Jia Li, Zhenzhen Hu, Jinpeng Hu, Lizi Liao, Erik Cambria, Richang Hong · 2026-05-07 04:00

To Fuse or to Drop? Dual-Path Learning for Resolving Modality Conflicts in Multimodal Emotion Recognition

arXiv:2605.04877v1 Announce Type: cross Abstract: Multimodal emotion recognition (MER) benefits from combining text, audio, and vision, yet standard fusion often fails when modalities conflict. Crucially, conflicts differ in resolvability: benign conflicts stem from missing, weak…

arXiv cs.LG TIER_1 English(EN) · Richang Hong · 2026-05-06 13:11

To Fuse or to Drop? Dual-Path Learning for Resolving Modality Conflicts in Multimodal Emotion Recognition

Multimodal emotion recognition (MER) benefits from combining text, audio, and vision, yet standard fusion often fails when modalities conflict. Crucially, conflicts differ in resolvability: benign conflicts stem from missing, weak, or ambiguous cues and can be mitigated by cross-…

arXiv cs.CV TIER_1 English(EN) · Yueru Sun, Yimeng Zhang, Haoyu Gu, Nuo Chen, Dong She, Xianrong Yao, Yang Gao, Zhanpeng Jin · 2026-05-05 04:00

EmoMM: Benchmarking and Steering MLLM for Multimodal Emotion Recognition under Conflict and Missingness

arXiv:2605.01024v1 Announce Type: new Abstract: Multimodal Emotion Recognition (MER) is critical for interpreting real-world interactions. While Multimodal Large Language Models (MLLM) have shown promise in MER, their internal decision-making mechanisms under modality conflict an…
