Researchers have introduced ReTrack, a framework for composed video retrieval (CVR). In CVR, the query pairs a reference video with a text description of the desired modifications, and the system must retrieve the video that matches both. ReTrack targets the information imbalance between the two modalities: the video typically carries far more information than the text, which biases retrieval towards the reference video. The framework uses a dual-stream network with modules for semantic disentanglement, composition geometry calibration, and evidence-driven alignment to better interpret multi-modal queries. It is reported to achieve state-of-the-art performance on both CVR and composed image retrieval tasks.
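The retrieval bias described above can be illustrated with a toy sketch. This is not ReTrack's method; it only shows, under assumed two-dimensional embeddings, how a naive sum of video and text embeddings lets a large-norm video embedding dominate the query, and how normalizing each modality before mixing restores the text's influence. The names `compose_query`, `rank_candidates`, and `text_weight`, as well as all embedding values, are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def compose_query(ref_video_emb, text_emb, text_weight=0.5):
    """Hypothetical fusion: normalize each modality, then mix with a weight."""
    v = normalize(ref_video_emb)
    t = normalize(text_emb)
    mixed = [(1 - text_weight) * a + text_weight * b for a, b in zip(v, t)]
    return normalize(mixed)


def rank_candidates(query, candidates):
    """Return candidate names sorted by cosine similarity to the query."""
    q = normalize(query)
    scores = {
        name: sum(a * b for a, b in zip(q, normalize(emb)))
        for name, emb in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (assumed values): the video embedding has a much larger norm
# than the text embedding, mimicking the imbalance between modalities.
ref_video = [10.0, 0.0]
text_mod = [0.0, 1.0]

candidates = {
    "reference_like": [1.0, 0.1],   # nearly identical to the reference video
    "modified_target": [1.0, 1.0],  # reflects both the video and the text edit
}

# Naive fusion: a plain sum, so the video embedding dominates the direction.
naive_query = [a + b for a, b in zip(ref_video, text_mod)]
naive_ranking = rank_candidates(naive_query, candidates)

# Balanced fusion: each modality contributes equally after normalization.
balanced_query = compose_query(ref_video, text_mod)
balanced_ranking = rank_candidates(balanced_query, candidates)

print("naive:", naive_ranking)
print("balanced:", balanced_ranking)
```

With these assumed values, the naive query ranks the reference-like candidate first, while the balanced query ranks the modified target first.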
Summary written by gemini-2.5-flash-lite from 1 source.