PulseAugur

New framework enables zero-shot captioning of Indonesian traditional clothing

Researchers have developed Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of traditional Indonesian clothing. The system combines CLIP and BERT text encoders with an LSTM caption decoder. It was trained on data from 24 Indonesian provinces and evaluated on 8 provinces not seen during training. On this evaluation it reached a CLIPScore of 0.8536, BLEU-4 of 0.3342, and METEOR of 0.4859, with reported improvements in cultural vocabulary recovery and overall accuracy, particularly in low-resource heritage contexts.
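The province-level protocol described above, where test provinces never appear in training, can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `split_by_province`, the `(image_id, province)` schema, the placeholder province names, the even 100-images-per-province distribution, and the random assignment of provinces are all assumptions made for illustration. Only the counts (3,800 images, 38 provinces, 24 for training, 8 for testing) come from the coverage. The six remaining provinces are left unassigned here because the summary does not say how they are used.

```python
import random


def split_by_province(samples, n_train=24, n_test=8, seed=0):
    """Partition samples so that no province appears in both train and test.

    samples: list of (image_id, province) pairs (hypothetical schema).
    Provinces are assigned at random, which is an assumption of this sketch.
    """
    provinces = sorted({province for _, province in samples})
    if n_train + n_test > len(provinces):
        raise ValueError("not enough provinces for the requested split")

    rng = random.Random(seed)
    rng.shuffle(provinces)

    train_provinces = set(provinces[:n_train])
    test_provinces = set(provinces[n_train:n_train + n_test])

    # Each image follows its province, so test provinces stay fully unseen.
    train = [s for s in samples if s[1] in train_provinces]
    test = [s for s in samples if s[1] in test_provinces]
    return train, test, train_provinces, test_provinces


# Toy data: 38 placeholder provinces with 100 images each (3,800 total).
samples = [
    (f"img_{p:02d}_{i:03d}", f"province_{p:02d}")
    for p in range(38)
    for i in range(100)
]
train, test, train_p, test_p = split_by_province(samples)
```

Splitting on provinces rather than on individual images is what makes the evaluation zero-shot at the province level: a random per-image split would leak garments from every province into training.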

IMPACT Advances zero-shot captioning capabilities for cultural heritage data, potentially improving accessibility and analysis of specialized visual datasets.

RANK_REASON The cluster describes a research paper published on arXiv detailing a new framework for image analysis and captioning.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CV TIER_1 English(EN) · Anugrah Aidin Yotolembah, Novanto Yudistira, Gembong Edhi Setyawan

    Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

    arXiv:2606.13275v1. Abstract: This paper presents Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of Indonesian traditional garments. The dataset contains 3,800 expert-annotated images from all 38 Indonesian provinces. U…

  2. arXiv cs.CV TIER_1 English(EN) · Gembong Edhi Setyawan

    Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

    This paper presents Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of Indonesian traditional garments. The dataset contains 3,800 expert-annotated images from all 38 Indonesian provinces. Using a province-level inductive zero-shot protoc…