Brief · PulseAugur

TOOL · arXiv cs.CV English(EN) · 18h

Show, Don't Ask: Generative Visual Disambiguation for Composed Image Retrieval with Turn-Valid Coverage

Researchers have introduced CLARA, a framework for resolving ambiguity in composed image retrieval (CIR). Where earlier methods ask clarifying questions in text, CLARA shows the user a small set of visual alternatives and lets them pick the image closest to their intent, so the model never has to predict a textual answer. To keep its conformal guarantees valid across multiple interaction rounds, CLARA reweights the calibration data according to the user's selections and grounds every displayed prototype in a real image from the corpus.
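To make the reweighting idea concrete, here is a minimal Python sketch of a generic weighted split-conformal threshold. It is not CLARA's algorithm: the brief does not say how the reweighting is computed, so the function names, the `boost` factor, and the rule for deciding which calibration examples "agree" with a user's selection are all hypothetical. The sketch only illustrates the mechanics of how shifting calibration weights changes the size of the retrieved set, and makes no claim about the coverage guarantee the paper proves.

```python
import math


def weighted_conformal_threshold(scores, weights, alpha, test_weight=1.0):
    """Return the smallest calibration score q such that the normalized
    weighted mass of scores <= q is at least 1 - alpha.

    scores  : nonconformity scores of calibration queries (lower = better match)
    weights : nonnegative weight per calibration query
    The test point's own weight is included in the normalizer, as in
    standard weighted split-conformal prediction.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights) + test_weight
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight
        if cumulative / total >= 1.0 - alpha:
            return score
    # Not enough calibration mass: the set must include every candidate.
    return math.inf


def reweight_by_selection(weights, agrees_with_selection, boost=3.0):
    """Hypothetical reweighting rule (an assumption, not from the source):
    upweight calibration queries consistent with the user's visual pick."""
    return [w * boost if agrees else w
            for w, agrees in zip(weights, agrees_with_selection)]


def retrieval_set(candidate_scores, threshold):
    """Keep every candidate image whose score is within the threshold."""
    return [name for name, score in candidate_scores.items()
            if score <= threshold]


# Toy calibration data, invented for illustration only.
calibration_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
uniform_weights = [1.0] * len(calibration_scores)

before = weighted_conformal_threshold(calibration_scores, uniform_weights, alpha=0.25)

# Suppose the user's pick is consistent with the four lowest-score examples.
agrees = [True, True, True, True, False, False, False, False, False]
after = weighted_conformal_threshold(
    calibration_scores, reweight_by_selection(uniform_weights, agrees), alpha=0.25
)
```

In this toy run the threshold tightens after the simulated selection, so fewer candidates pass `retrieval_set`. Whether real user selections shrink or grow the set depends on the data and on the actual reweighting scheme.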

IMPACT Letting users pick from visual alternatives rather than answer text prompts could make disambiguation in image search more intuitive and improve retrieval accuracy.

Hugging Face
arXiv
Composed image retrieval
CLARA