New Japanese image-text dataset boosts AI cultural understanding

By PulseAugur Editorial · [1 source] · 2026-06-02 04:00

Researchers have introduced WAON, a large-scale Japanese image-text dataset comprising approximately 155 million examples sourced from native Japanese web content. This dataset aims to improve the cultural understanding of contrastive vision-language models. Alongside WAON, they developed WAON-Bench, a curated benchmark for Japanese cultural understanding with 374 classes. Experiments show that models fine-tuned on WAON outperform those trained on translated English data for Japanese cultural tasks.

IMPACT Enables development of AI models with improved understanding of Japanese culture and language nuances.

RANK_REASON The cluster describes a new academic paper introducing a dataset and benchmark for AI research.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Issa Sugiura, Shuhei Kurita, Yusuke Oda, Daisuke Kawahara, Yasuo Okabe, Naoaki Okazaki · 2026-06-02 04:00

WAON: A Large-Scale Japanese Image-Text Dataset for Cultural Adaptation in Contrastive Vision-Language Models

arXiv:2510.22276v3 Announce Type: replace-cross Abstract: Contrastive vision-language models have achieved remarkable progress through large-scale pretraining. Recent work has shown that removing English-only caption filters and pretraining on global data is effective for improvi…
