COMBINER: Composed Image Retrieval Guided by Attribute-based Neighbor Relations
Researchers have developed COMBINER, a method for Composed Image Retrieval (CIR) that aims to improve the accuracy of locating a target image from a multimodal query, typically a reference image combined with modification text. The approach addresses cases where visually similar images differ in their attributes by building a unified representation of cross-modal features based on attribute prototypes. COMBINER uses modules for adaptive semantic disentanglement, unified prototype-based composition, and dual relations modeling, with the goal of better capturing semantic relationships between samples.
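To make the idea of a prototype-based unified representation more concrete, here is a minimal toy sketch in plain Python. It is not the COMBINER implementation: the summary above gives no architectural details, so every name here (`prototype_weights`, `compose`, `rank`, the `temperature` and `alpha` parameters, the three-dimensional features, and the simple weighted average used for composition) is a hypothetical choice made for illustration. The sketch only shows one way image and text features could be expressed over a shared set of attribute prototypes and then compared.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prototype_weights(feature, prototypes, temperature=0.1):
    """Express a feature as softmax weights over shared attribute prototypes.

    Assumption: cosine similarity plus a temperature-scaled softmax.
    """
    sims = [cosine(feature, p) / temperature for p in prototypes]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def compose(image_feature, text_feature, prototypes, alpha=0.5):
    """Blend image and text in prototype space (hypothetical weighted average)."""
    w_img = prototype_weights(image_feature, prototypes)
    w_txt = prototype_weights(text_feature, prototypes)
    return [alpha * a + (1 - alpha) * b for a, b in zip(w_img, w_txt)]


def rank(query_weights, candidates, prototypes):
    """Order candidate names by similarity to the query in prototype space."""
    scored = [
        (cosine(query_weights, prototype_weights(feat, prototypes)), name)
        for name, feat in candidates.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy data: three orthogonal prototypes standing in for three attributes.
prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
reference_image = [1.0, 0.0, 0.0]    # exhibits attribute 0
modification_text = [0.0, 1.0, 0.0]  # asks for attribute 1

candidates = {
    "both_attributes": [1.0, 1.0, 0.0],
    "reference_only": [1.0, 0.0, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}

query = compose(reference_image, modification_text, prototypes)
ranking = rank(query, candidates, prototypes)
print(ranking)
```

In this toy setup, the candidate carrying both the reference attribute and the requested attribute ranks first, ahead of a candidate that merely resembles the reference image. That is the kind of distinction between visually similar images with different attributes that the summary describes, though the actual method may achieve it very differently.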