NEST: Narrative Event Structures in Time for Long Video Understanding

New NEST dataset challenges long-video AI models with narrative understanding

By the PulseAugur editorial team · [2 sources] · 2026-06-18 02:05

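Researchers have introduced NEST, a new dataset designed to evaluate how well long-video models understand narrative. NEST contains 1005 full-length movies, each annotated with more than 100 multimodal narrative events that are linked by temporal, hierarchical, and long-range dependency relations. The dataset aims to go beyond simple retrieval tasks and assess how models handle complex narrative structure, including causal relations that span long stretches of time and reconstructed events. Preliminary baseline results show that models face significant challenges in event detection and argument extraction, although event relation extraction shows more promise.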

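Impact: Introduces a challenging new benchmark for evaluating long-video understanding in AI models, pushing the boundaries of narrative understanding.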

Ranking rationale: This cluster describes a new academic dataset and benchmark for evaluating AI models, published in a research paper on arXiv.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Ali Asgarov, Kaushik Narasimhan, Najibul Haque Sarker, Hani Alomari, Chia-Wei Tang, Anushka Sivakumar, Zaber Ibn Abdul Hakim, Shaurya Mallampati, Chris Thomas · 2026-06-19 04:00

NEST: Narrative Event Structures in Time for Long Video Understanding

arXiv:2606.19706v1 Announce Type: cross Abstract: Recent progress in vision-language models has enabled the processing of increasingly long video sequences, but the ability to handle extended token streams does not translate to understanding of narrative structure in long videos.…
arXiv cs.CL TIER_1 English(EN) · Chris Thomas · 2026-06-18 02:05

NEST: Narrative Event Structures in Time for Long Video Understanding

Recent progress in vision-language models has enabled the processing of increasingly long video sequences, but the ability to handle extended token streams does not translate to understanding of narrative structure in long videos. Existing long video benchmarks focus on needle-in…
