PulseAugur

New Composed Object Retrieval task enables object-level image search

Researchers have introduced Composed Object Retrieval (COR), a task that enables object-level retrieval within images using composed expressions. Unlike existing Composed Image Retrieval (CIR) methods, which match entire images, COR localizes specific objects and grounds them with pixel-level masks. The task requires models to perform complex visual-textual reasoning to identify the desired modifications to reference objects, even in the presence of visually similar distractors. To support it, the authors created a benchmark called COR125K, featuring over 125,000 retrieval triplets across numerous categories. The proposed CORE model is reported to improve significantly on current CIR pipelines and baselines, establishing a foundation for fine-grained, object-level multimodal retrieval.

IMPACT This research could lead to more precise and nuanced image search capabilities, improving applications that require fine-grained visual content understanding.

RANK_REASON The cluster describes a new research paper introducing a novel task and benchmark for object-level image retrieval.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CV TIER_1 (CA) · Tong Wang, Guanyu Yang, Nian Liu, Zongyan Han, Jinxing Zhou, Salman Khan, Fahad Shahbaz Khan

    Composed Object Retrieval: Object-level Retrieval via Composed Expressions

    arXiv:2508.04424v3 · Abstract: Retrieving fine-grained visual content based on user intent remains a challenge in multimodal systems. Although current Composed Image Retrieval (CIR) methods combine reference images with retrieval texts, they are constrained t…