VietFashion: Benchmarking Sketch-Text Composed Image Retrieval for Cultural Outfits
Researchers have introduced VietFashion, a benchmark for sketch-text composed image retrieval that focuses on cultural outfits such as the traditional Vietnamese áo dài. Each query combines a hand-drawn sketch with a textual description to retrieve culturally significant garments, targeting the subtle details that standard AI models tend to miss. The dataset comprises over 21,000 images and is intended to challenge current retrieval methods through fine-grained cultural semantics and a multi-target retrieval setting that accounts for ambiguity in design intent.
AI IMPACT: This benchmark could advance fine-grained visual retrieval in specialized domains such as fashion and may improve AI's understanding of cultural nuances.
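To make the idea of sketch-text composed retrieval with multiple acceptable targets more concrete, here is a minimal Python sketch. It is not the benchmark's method or its official metric: the late-fusion rule in `compose_query`, the weight `alpha`, the toy 2-D embeddings, and the recall definition in `multi_target_recall_at_k` are all assumptions chosen for illustration.

```python
import math


def normalize(v):
    """L2-normalize a vector given as a list of floats."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def compose_query(sketch_emb, text_emb, alpha=0.5):
    """Hypothetical late fusion: weighted sum of the normalized embeddings."""
    s, t = normalize(sketch_emb), normalize(text_emb)
    return normalize([alpha * a + (1 - alpha) * b for a, b in zip(s, t)])


def rank_gallery(query, gallery):
    """Return gallery ids sorted by cosine similarity to the query, best first."""
    scores = {
        gid: sum(q * g for q, g in zip(query, normalize(emb)))
        for gid, emb in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def multi_target_recall_at_k(ranked_ids, target_ids, k):
    """Fraction of the acceptable targets that appear in the top k results."""
    top = set(ranked_ids[:k])
    return len(top & set(target_ids)) / len(target_ids)


# Toy data (assumed): 2-D embeddings standing in for real encoder outputs.
gallery = {"a": [1.0, 1.0], "b": [1.0, 0.8], "c": [1.0, 0.0], "d": [-1.0, 0.0]}
query = compose_query(sketch_emb=[1.0, 0.0], text_emb=[0.0, 1.0])
ranked = rank_gallery(query, gallery)

# Two garments are treated as equally valid matches for the same design intent.
targets = {"a", "c"}
recall_at_2 = multi_target_recall_at_k(ranked, targets, k=2)
recall_at_3 = multi_target_recall_at_k(ranked, targets, k=3)
```

In this toy setup a single query has two valid targets, so a system is credited for each acceptable garment it surfaces rather than for one fixed ground-truth image. A real evaluation would replace the toy vectors with outputs from trained sketch and text encoders.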