Reason-Then-Retrieve for CoVR-R with Structured Edit Prompts and Dense-Sparse Fusion
Researchers have developed a zero-shot pipeline for CoVR-R, a reason-aware composed video retrieval task. The system uses the Qwen3.5-27B model to infer the target video from an edit instruction applied to a reference video. It combines dense retrieval with a sparse TF-IDF branch and performs strongly on the validation and test splits, with R@1 reaching up to 89.73.
AI IMPACT: Introduces a reasoning-based approach to composed video retrieval, which may improve search accuracy for complex video edits.
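The dense-sparse fusion mentioned in the summary can be illustrated with a minimal sketch. The summary does not say how the two branches are combined, so everything here is an assumption made for illustration: the min-max normalization, the weighted sum, the weight `alpha`, the whitespace tokenizer, and the function names are all hypothetical. Dense embeddings are passed in as plain lists of floats standing in for the output of some encoder.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse TF-IDF dict per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cosine_dense(a, b):
    """Cosine similarity between two dense vectors stored as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(scores):
    """Rescale scores to [0, 1] so the two branches are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_and_rank(query_text, query_emb, cand_texts, cand_embs, alpha=0.7):
    """Rank candidates by alpha * dense + (1 - alpha) * sparse (assumed rule)."""
    docs = [t.lower().split() for t in cand_texts]
    vecs, idf = tfidf_vectors(docs)
    q_counts = Counter(query_text.lower().split())
    # Query terms unseen in the candidate pool carry no TF-IDF weight here.
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    sparse = [cosine_sparse(q_vec, v) for v in vecs]
    dense = [cosine_dense(query_emb, e) for e in cand_embs]
    fused = [alpha * d + (1 - alpha) * s
             for d, s in zip(min_max(dense), min_max(sparse))]
    return sorted(range(len(cand_texts)), key=lambda i: fused[i], reverse=True)
```

In a reason-then-retrieve setting, `query_text` would be the description of the target video inferred by the language model, and `cand_texts` would be text associated with each candidate video. Setting `alpha=1.0` reduces the sketch to dense-only ranking and `alpha=0.0` to TF-IDF-only ranking, which makes it easy to compare the contribution of each branch.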