Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

New framework enables zero-shot captioning of traditional Indonesian clothing

By the PulseAugur editorial team · [1 source] · 2026-06-12 04:00

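Researchers have developed Custom ZeroCLIP, a novel retrieval-augmented vision-language framework designed for zero-shot captioning of traditional Indonesian clothing. The framework combines CLIP and BERT text encoders with an LSTM decoder and was trained on a dataset of 3,800 expert-annotated images. Under a province-level inductive zero-shot protocol, the model performs well on provinces it never saw during training, reaching a CLIPScore of 0.8536 and outperforming existing baselines.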
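To make the evaluation setup concrete, here is a minimal sketch of a province-level hold-out split and a CLIPScore computation. It is illustrative only and is not the authors' code. The province labels, the balanced 100 images per province, the choice of 8 held-out provinces, and the function names are all assumptions. The score follows the standard CLIPScore definition, w * max(cos(image, text), 0) with w = 2.5; the summary does not say whether the reported 0.8536 uses this scaling.

```python
import math
import random


def province_holdout_split(samples, n_unseen, seed=0):
    """Split (image_id, province) pairs so that held-out provinces
    never appear in the training set (inductive zero-shot setting)."""
    provinces = sorted({prov for _, prov in samples})
    rng = random.Random(seed)
    unseen = set(rng.sample(provinces, n_unseen))
    train = [s for s in samples if s[1] not in unseen]
    test = [s for s in samples if s[1] in unseen]
    return train, test, unseen


def clip_score(image_emb, text_emb, w=2.5):
    """Standard CLIPScore: w * max(cosine(image, text), 0).
    Embeddings are plain lists of floats, assumed to come from CLIP."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm_img = math.sqrt(sum(a * a for a in image_emb))
    norm_txt = math.sqrt(sum(b * b for b in text_emb))
    return w * max(dot / (norm_img * norm_txt), 0.0)


# Toy data (assumption): 38 placeholder provinces, 100 images each = 3,800.
samples = [
    (f"img_{p:02d}_{i:03d}", f"province_{p:02d}")
    for p in range(38)
    for i in range(100)
]

# Hold out 8 provinces (hypothetical number) for zero-shot evaluation.
train, test, unseen = province_holdout_split(samples, n_unseen=8)
train_provinces = {prov for _, prov in train}
```

The key property is that `train_provinces` and `unseen` are disjoint, so any caption produced for a test image describes clothing from a province the model was never trained on.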

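Impact: This research advances zero-shot learning on specialized cultural heritage datasets and could improve AI's ability to understand and describe diverse cultural artifacts.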

Ranking rationale: This cluster covers a research paper published on arXiv that details a new image analysis framework.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Anugrah Aidin Yotolembah, Novanto Yudistira, Gembong Edhi Setyawan · 2026-06-12 04:00

Zero-Shot Captioning for Cultural Heritage: Automated Image Analysis of Traditional Indonesian Clothing

arXiv:2606.13275v1. Abstract: This paper presents Custom ZeroCLIP, a retrieval-augmented vision-language framework for zero-shot captioning of Indonesian traditional garments. The dataset contains 3,800 expert-annotated images from all 38 Indonesian provinces. U…
