Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval

New benchmark and models advance generalized moment retrieval in video

By PulseAugur Editorial Team · [2 sources] · 2026-05-04 14:14

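Researchers have introduced Generalized Moment Retrieval (GMR), a new framework for video analysis that drops the assumption that each query has exactly one matching moment. The approach aims to retrieve every relevant temporal segment, or to correctly recognize when no moment matches the given natural language query. To support this, they built the Soccer-GMR benchmark from soccer videos and proposed two modeling paradigms: a GMR adapter for existing models and a GRPO reward for fine-tuning multimodal large language models.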
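To make the task definition concrete, here is a minimal sketch of how a GMR-style answer could be scored when a query may have zero, one, or several matching moments. It assumes an answer is a list of (start, end) times in seconds, with an empty list meaning "no matching moment". The function names, the greedy matching, and the 0.5 IoU threshold are illustrative assumptions and are not taken from the paper or the Soccer-GMR benchmark.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec); hypothetical representation


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def match_segments(pred: List[Segment], gold: List[Segment],
                   thr: float = 0.5) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (true_pos, false_pos, false_neg)."""
    unmatched = list(gold)
    tp = 0
    for p in pred:
        # Pick the still-unmatched gold segment that overlaps this prediction most.
        best = max(unmatched, key=lambda g: temporal_iou(p, g), default=None)
        if best is not None and temporal_iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    return tp, len(pred) - tp, len(unmatched)


def set_f1(pred: List[Segment], gold: List[Segment], thr: float = 0.5) -> float:
    """F1 over segment sets; an empty prediction for an empty gold set scores 1."""
    if not pred and not gold:
        return 1.0  # correctly reported that no moment matches the query
    tp, fp, fn = match_segments(pred, gold, thr)
    return 2 * tp / (2 * tp + fp + fn)
```

Under this sketch, a model that always returns exactly one segment is penalized both on queries with several matching moments (missed segments count as false negatives) and on queries with none (the lone prediction counts as a false positive), which is the gap the single-moment assumption leaves open.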

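Impact: Establishes a more realistic benchmark for video-language understanding, with the potential to improve how AI systems process and retrieve information from video content.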

Ranking rationale: This is an academic paper introducing a new benchmark and models for a video retrieval task.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Yiming Ding, Siyu Cao, Luyuan Jiao, Yixuan Li, Zitong Wang, Zhiyong Liu, Lu Zhang · 2026-05-05 04:00

Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval

arXiv:2605.02623v1 Announce Type: new Abstract: Video Moment Retrieval (VMR) aims to localize temporal segments in videos that correspond to a natural language query, but typically assumes only a single matching moment for each query. This assumption does not always hold in real-…
arXiv cs.CV TIER_1 English(EN) · Lu Zhang · 2026-05-04 14:14

Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval

Video Moment Retrieval (VMR) aims to localize temporal segments in videos that correspond to a natural language query, but typically assumes only a single matching moment for each query. This assumption does not always hold in real-world scenarios, where queries may correspond to…
