PulseAugur
Divide, Deliberate, Decide: A Multi-Agent Framework for Fine-Grained Egocentric Action Recognition

New multi-agent framework improves egocentric action recognition

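Researchers have introduced "Divide, Deliberate, Decide", a multi-agent framework for fine-grained action recognition in egocentric video. The zero-shot system uses a VLM coordinator to segment the video and propose candidate actions. In the deliberation stage that follows, heterogeneous VLM experts consult one another. The framework then aggregates the agents' rankings to refine its prediction without any fine-tuning, and by exploiting decorrelated model priors it outperforms baseline methods.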
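The summary does not say how the agents' rankings are aggregated. As a rough illustration only, the sketch below fuses per-agent rankings with a Borda count, a standard rank-aggregation rule used here as a stand-in and not confirmed as the paper's method. The function name `borda_aggregate`, the agent names, and the candidate action labels are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def borda_aggregate(rankings: Dict[str, List[str]]) -> List[str]:
    """Fuse per-agent rankings of candidate actions with a Borda count.

    Assumption: each agent returns its candidates ordered best-first.
    A candidate missing from an agent's list gets no points from that agent.
    """
    candidates = sorted({c for ranking in rankings.values() for c in ranking})
    n = len(candidates)
    scores = defaultdict(int)
    for ranking in rankings.values():
        for position, candidate in enumerate(ranking):
            # The top-ranked candidate earns n - 1 points, the next n - 2, and so on.
            scores[candidate] += n - 1 - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Hypothetical rankings from three VLM experts for one video segment.
rankings = {
    "expert_a": ["cut onion", "peel onion", "wash onion"],
    "expert_b": ["peel onion", "cut onion", "wash onion"],
    "expert_c": ["cut onion", "wash onion", "peel onion"],
}
fused = borda_aggregate(rankings)
print(fused[0])
```

A positional rule like this needs no training, which fits the zero-shot, no-fine-tuning setting described above. It also only helps when the experts make different mistakes, which is presumably where decorrelated model priors matter.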

Impact: By using collaborative AI agents, the framework could improve how accurately AI systems interpret complex visual data.

Ranking rationale: This cluster covers a new research paper published on arXiv that details a new framework for action recognition.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.AI TIER_1 English(EN) · Alessandro Sottovia, Alessandro Torcinovich, Oswald Lanz

    Divide, Deliberate, Decide: A Multi-Agent Framework for Fine-Grained Egocentric Action Recognition

    arXiv:2606.17627v1 Announce Type: cross Abstract: Fine-grained action recognition in egocentric video is challenging for Vision-Language Models (VLMs): actions often differ only in small visual cues, and a single model tends to be biased toward a subset of these cues. We propose …

  2. arXiv cs.CV TIER_1 English(EN) · Oswald Lanz

    Divide, Deliberate, Decide: A Multi-Agent Framework for Fine-Grained Egocentric Action Recognition

    Fine-grained action recognition in egocentric video is challenging for Vision-Language Models (VLMs): actions often differ only in small visual cues, and a single model tends to be biased toward a subset of these cues. We propose Divide, Deliberate, Decide, a fully-local, zero-sh…