PulseAugur
research · [3 sources]

TEMA architecture improves composed image retrieval with multi-modification capabilities

Researchers have introduced TEMA, a Text-oriented Entity Mapping Architecture for Composed Image Retrieval (CIR). The framework handles multi-modification text queries and addresses limitations of existing CIR systems such as insufficient entity coverage and clause-entity misalignment. To support this setting, the authors created two new datasets, M-FashionIQ and M-CIRR, and they report superior performance across benchmarks while maintaining efficiency.

Summary written by gemini-2.5-flash-lite from 3 sources.

IMPACT Extends image retrieval to queries whose text specifies several modifications to a reference image at once.

RANK_REASON Academic paper introducing a new architecture and datasets for image retrieval.


COVERAGE [3]

  1. Hugging Face Daily Papers TIER_1 ·

    TEMA: Anchor the Image, Follow the Text for Multi-Modification Composed Image Retrieval

    Composed Image Retrieval (CIR) is an important image retrieval paradigm that enables users to retrieve a target image using a multimodal query that consists of a reference image and modification text. Although research on CIR has made significant progress, prevailing setups still…

  2. arXiv cs.CV TIER_1 · Zixu Li, Yupeng Hu, Zhiheng Fu, Zhiwei Chen, Yongqi Li, Liqiang Nie ·

    TEMA: Anchor the Image, Follow the Text for Multi-Modification Composed Image Retrieval

    arXiv:2604.21806v2. Abstract: Composed Image Retrieval (CIR) is an important image retrieval paradigm that enables users to retrieve a target image using a multimodal query that consists of a reference image and modification text. Although research on CIR ha…

  3. arXiv cs.CV TIER_1 · Liqiang Nie ·

    TEMA: Anchor the Image, Follow the Text for Multi-Modification Composed Image Retrieval

    Composed Image Retrieval (CIR) is an important image retrieval paradigm that enables users to retrieve a target image using a multimodal query that consists of a reference image and modification text. Although research on CIR has made significant progress, prevailing setups still…
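
The CIR setup described in the abstracts above (a reference image plus modification text forming a single query) can be illustrated with a toy sketch. This is not TEMA: it is a generic additive-fusion baseline, and the three-dimensional embeddings, the gallery entries, and the helper names `compose_query` and `retrieve` are all hypothetical, chosen only to show how a text carrying two modifications can steer retrieval away from the reference image.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def compose_query(image_vec, text_vec, text_weight=0.5):
    """Hypothetical fusion step: weighted sum of image and text embeddings."""
    return [(1 - text_weight) * i + text_weight * t
            for i, t in zip(image_vec, text_vec)]

def retrieve(image_vec, text_vec, gallery, top_k=2):
    """Rank gallery items by similarity to the composed query."""
    query = compose_query(image_vec, text_vec)
    ranked = sorted(gallery.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Toy embeddings (assumed axes: dress, red, long sleeves).
reference_image = [1.0, 0.0, 0.0]      # a plain dress
modification_text = [0.0, 1.0, 1.0]    # "make it red and add long sleeves"
gallery = {
    "plain_dress": [1.0, 0.0, 0.0],
    "red_dress": [1.0, 1.0, 0.0],
    "red_long_sleeve_dress": [1.0, 1.0, 1.0],
    "red_scarf": [0.0, 1.0, 0.0],
}

results = retrieve(reference_image, modification_text, gallery)
print(results)
```

In this toy gallery the item satisfying both modifications ranks first and the item satisfying only one ranks second. Real systems replace the hand-written vectors with learned image and text encoders.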