Researchers have introduced CoMET-Bench, a benchmark for Conditional Multi-Event Temporal Grounding in long-form videos. Existing benchmarks fall short because they typically localize only a single event or treat grounding and counting as separate tasks. To address these limitations, CoMET-Bench pairs a large dataset of complex queries with a unified evaluation protocol and a new Rejection-F1 metric. The authors also propose an agentic framework, CoMET-Agent, which reformulates the task as structured search and aggregation and is reported to outperform GPT-5.
- alphaXiv
- arXiv
- CatalyzeX
- CoMET-Agent
- CoMET-Bench
- Connected Papers
- CORE Recommender
- DagsHub
- Gotit.pub
- GPT-5
- Hugging Face
- Litmaps
- ScienceCast
- scite Smart Citations
AI-generated summary · Google Gemini · from 1 source.