New framework enables zero-shot captioning of Indonesian traditional clothing

By PulseAugur Editorial · [2 sources] · 2026-06-11 12:29

Researchers have developed Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of traditional Indonesian clothing. The system combines CLIP and BERT text encoders with an LSTM caption decoder; it was trained on data from 24 Indonesian provinces and evaluated on 8 unseen provinces. The authors report a CLIPScore of 0.8536, BLEU-4 of 0.3342, and METEOR of 0.4859, along with improvements in cultural vocabulary recovery and overall accuracy, particularly in low-resource heritage contexts.
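The province-level evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the record fields (`image`, `province`), the function name `split_by_province`, and the toy data are all hypothetical. The toy set uses 32 provinces only because the summary mentions 24 training and 8 unseen provinces; the summary does not say how the remaining provinces of the 38 in the dataset are used.

```python
import random

def split_by_province(records, n_test_provinces, seed=0):
    """Hold out whole provinces so that no test province is seen in training.

    `records` is a list of dicts with a hypothetical "province" key.
    """
    provinces = sorted({r["province"] for r in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    held_out = set(rng.sample(provinces, n_test_provinces))
    train = [r for r in records if r["province"] not in held_out]
    test = [r for r in records if r["province"] in held_out]
    return train, test, held_out

# Toy data (assumption): 32 placeholder provinces with 3 images each.
records = [
    {"image": f"img_{p:02d}_{i}.jpg", "province": f"province_{p:02d}"}
    for p in range(32)
    for i in range(3)
]

train, test, held_out = split_by_province(records, n_test_provinces=8)
```

Splitting by province rather than by image is what makes the test "unseen": a random per-image split would let garments from the same province appear on both sides and overstate how well the model generalizes.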

IMPACT Advances zero-shot captioning capabilities for cultural heritage data, potentially improving accessibility and analysis of specialized visual datasets.

RANK_REASON The cluster describes a research paper published on arXiv detailing a new framework for image analysis and captioning.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.CV TIER_1 English(EN) · Anugrah Aidin Yotolembah, Novanto Yudistira, Gembong Edhi Setyawan · 2026-06-12 04:00

Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

arXiv:2606.13275v1. Abstract: This paper presents Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of Indonesian traditional garments. The dataset contains 3,800 expert-annotated images from all 38 Indonesian provinces. U…
arXiv cs.CV TIER_1 English(EN) · Gembong Edhi Setyawan · 2026-06-11 12:29

Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

This paper presents Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of Indonesian traditional garments. The dataset contains 3,800 expert-annotated images from all 38 Indonesian provinces. Using a province-level inductive zero-shot protoc…
