HakushoBench: A Japanese Chart and Table VQA Benchmark from Governmental White Papers
Researchers have developed HakushoBench, a new benchmark for evaluating how well vision-language models (VLMs) understand Japanese charts and tables. The dataset is derived from 33 Japanese governmental white papers and contains over 2,000 images along with manually annotated question-answer pairs. Initial experiments show a significant performance gap between open-weight and proprietary models, indicating substantial room for improvement in VLM capabilities for complex, non-English document analysis.
AI IMPACT Establishes a new evaluation standard for VLM performance on non-English visual data, potentially driving improvements in multilingual document understanding.