Researchers have introduced NEST, a new dataset for evaluating the narrative understanding of long-video models. NEST comprises 1005 full-length movies, each annotated with over 100 multimodal narrative events linked through temporal, hierarchical, and long-range dependencies. The dataset aims to move beyond simple retrieval tasks and assess how well models comprehend complex narrative structures, including reframed events and cause-and-effect relationships that span extended periods. Initial baseline results show that models struggle with event detection and argument extraction, while event relation extraction shows more promise.
AI IMPACT Introduces a challenging new benchmark for evaluating long-form video understanding and narrative comprehension in AI models.
RANK_REASON The cluster describes a new academic dataset and benchmark for evaluating AI models, presented in a research paper on arXiv.
AI-generated summary · Google Gemini · from 2 sources.