EvoGround: Self-Evolving Video Agents for Video Temporal Grounding

EvoGround uses self-evolving agents for video temporal grounding

By the PulseAugur editorial team · [1 source] · 2026-05-13 17:25

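Researchers have developed EvoGround, a framework that uses two self-evolving agents to perform video temporal grounding (VTG) without human-annotated data. The system pairs a generator agent, which produces query-moment pairs from raw videos, with a solver agent, which localizes those pairs and returns feedback that improves the generator. This self-reinforcing loop lets the two agents improve each other. According to the summary, the system achieves state-of-the-art results on VTG benchmarks and can also serve as a fine-grained video captioner.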
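The generator-solver loop described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Pair`, `toy_generator`, `toy_solver`, and `self_evolving_round` are hypothetical, the agents are random stand-ins rather than video models, and temporal IoU is used as the feedback signal only because it is a common way to compare two time intervals. The summary does not say what feedback EvoGround actually uses.

```python
from dataclasses import dataclass
import random


@dataclass
class Pair:
    query: str
    start: float  # seconds
    end: float    # seconds


def temporal_iou(a, b):
    """Intersection-over-union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def toy_generator(video_length, rng):
    """Hypothetical stand-in: propose a (query, moment) pair from a raw video."""
    start = rng.uniform(0.0, video_length - 5.0)
    return Pair(query=f"event near {start:.0f}s", start=start, end=start + 5.0)


def toy_solver(pair, rng):
    """Hypothetical stand-in: predict a moment for the query (a noisy copy here)."""
    shift = rng.uniform(-2.0, 2.0)
    return (pair.start + shift, pair.end + shift)


def self_evolving_round(video_length, n_pairs, seed=0):
    """One round: generate pairs, solve them, and collect per-pair feedback."""
    rng = random.Random(seed)
    feedback = []
    for _ in range(n_pairs):
        pair = toy_generator(video_length, rng)
        predicted = toy_solver(pair, rng)
        # Assumed feedback signal: overlap between proposed and predicted moment.
        feedback.append(temporal_iou((pair.start, pair.end), predicted))
    return feedback
```

Calling `self_evolving_round(60.0, 4)` returns four scores between 0 and 1. In a real system these scores would drive updates to both agents; the summary does not describe how that update works.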

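Impact: Introduces a new approach to video analysis that does not require extensive manual annotation, which could accelerate research and application development in video understanding.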

Ranking rationale: This cluster contains a paper detailing a new method for video temporal grounding.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV TIER_1 English(EN) · Lorenzo Torresani · 2026-05-13 17:25

EvoGround: Self-Evolving Video Agents for Video Temporal Grounding

Video temporal grounding (VTG) takes an untrimmed video and a natural-language query as input and localizes the temporal moment that best matches the query. Existing methods rely on large, task-specific datasets requiring costly manual annotation. We introduce EvoGround, a framew…

