Brief · PulseAugur

RESEARCH · arXiv cs.LG · 23h · [4 sources]

Reason, Retrieve, Re-rank: A Zero-Shot Reasoning-Aware Framework for Composed Video Retrieval

Researchers have developed new frameworks for zero-shot composed video retrieval, the task of finding a target video given a reference video and a textual modification instruction. The methods, presented at the CVPR 2026 VidLLMs workshop, use frozen foundation models to reason about the change the instruction implies and to re-rank candidate videos. One approach, R3-CoVR, achieved high accuracy by using a multimodal LLM to generate post-edit descriptions and a constraint-aware re-ranker to reorder candidates; another, R^3, focuses on reasoning-guided recall and re-ranking.
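The reason, retrieve, re-rank flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`embed`, `reason`, `recall`, `rerank`) are hypothetical, a bag-of-words counter stands in for a frozen encoder, string concatenation stands in for the multimodal LLM that writes the post-edit description, and candidate videos are represented only by short captions.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a frozen encoder: bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def reason(reference_caption, instruction):
    """Placeholder for the reasoning step (assumption): a real system would
    ask a multimodal LLM to describe the video after the edit. Here the
    instruction is simply appended to the reference caption."""
    return f"{reference_caption} {instruction}"


def recall(query, candidates, k):
    """Retrieve the k candidates whose captions are most similar to the query."""
    q = embed(query)
    ordered = sorted(
        candidates,
        key=lambda vid: cosine(q, embed(candidates[vid])),
        reverse=True,
    )
    return ordered[:k]


def rerank(shortlist, candidates, constraints):
    """Reorder the shortlist by how many required terms each caption contains.
    sorted() is stable, so ties keep their recall order."""
    def satisfied(vid):
        words = set(candidates[vid].lower().split())
        return sum(1 for c in constraints if c in words)
    return sorted(shortlist, key=satisfied, reverse=True)


# Hypothetical candidate pool: video id -> caption.
candidates = {
    "v1": "a dog runs on the beach",
    "v2": "dog on a beach at night under stars",
    "v3": "a cat sleeps on a sofa",
}

query = reason("a dog runs on the beach", "make it night")
shortlist = recall(query, candidates, k=2)
ranked = rerank(shortlist, candidates, constraints=["night"])
print(shortlist, ranked)
```

In this toy setup the similarity search alone favors the unedited reference scene, and the constraint check is what moves the night-time video to the top, which is the role the brief attributes to the re-ranking stage.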

IMPACT Introduces new methods for video retrieval that leverage LLMs for reasoning, potentially improving search accuracy and flexibility.

DINOv3
Qwen3-VL-8B
SigLIP-2
CVPR 2026 VidLLMs workshop
R3-CoVR