New Benchmark Evaluates VLMs on Extracting Data from Epidemic Curves

By PulseAugur Editorial · [2 sources] · 2026-05-26 15:48

Researchers have introduced EpiCurveBench, a benchmark for evaluating vision-language models (VLMs) on extracting data from epidemic curve charts. It comprises 1,000 real-world epidemic curve images and a new evaluation metric, EpiCurveSimilarity (ECS). Existing chart-extraction metrics treat extracted points as unordered key-value pairs; ECS instead aligns the predicted and ground-truth series using dynamic programming, so that it captures the temporal structure of time-series data. In initial evaluations, even the strongest VLMs reach only 52.3% ECS, which highlights both the difficulty of the task and the limitations of current evaluation methods.
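
To make the idea of order-aware alignment concrete, here is a minimal Python sketch of classic dynamic time warping (DTW), a standard dynamic-programming alignment for time series. It is an illustration only, not the paper's ECS definition: the function names `dtw_distance` and `order_aware_similarity`, and the normalization into a 0 to 1 score, are assumptions made for this example.

```python
from typing import Sequence


def dtw_distance(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Classic dynamic time warping cost between two non-empty series."""
    if not pred or not truth:
        raise ValueError("both series must be non-empty")
    n, m = len(pred), len(truth)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of pred[:i] with truth[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(pred[i - 1] - truth[j - 1])
            # Extend the best of: advance pred only, truth only, or both.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def order_aware_similarity(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Hypothetical score in [0, 1]; NOT the ECS formula from the paper."""
    # Assumed normalization: alignment cost relative to total ground-truth mass.
    scale = max(sum(abs(v) for v in truth), 1e-9)
    return max(0.0, 1.0 - dtw_distance(pred, truth) / scale)


if __name__ == "__main__":
    truth = [1, 3, 7, 3, 1]      # case counts in date order
    shuffled = [7, 1, 3, 1, 3]   # same values, wrong temporal order
    print(order_aware_similarity(truth, truth))
    print(order_aware_similarity(shuffled, truth))
```

A metric that compares unordered key-value pairs could not tell these two predictions apart by their values alone, whereas the alignment cost penalizes the shuffled series because its peak falls at the wrong position.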

IMPACT This benchmark and metric could lead to more accurate VLM performance evaluations for time-series chart extraction, with potential applications in public health data analysis.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.CL TIER_1 English(EN) · Thomas Berkane, Maimuna S. Majumder · 2026-05-27 04:00

EpiCurveBench: Evaluating VLMs on Epidemic Curve Digitization

arXiv:2605.27195v1. Abstract: Chart-to-data extraction with vision-language models (VLMs) is increasingly evaluated on benchmarks that show diminishing headroom (frontier VLMs exceed 89% on ChartQA) and with metrics that treat extracted points as unordered key-v…

arXiv cs.CL TIER_1 English(EN) · Maimuna S. Majumder · 2026-05-26 15:48

EpiCurveBench: Evaluating VLMs on Epidemic Curve Digitization

Chart-to-data extraction with vision-language models (VLMs) is increasingly evaluated on benchmarks that show diminishing headroom (frontier VLMs exceed 89% on ChartQA) and with metrics that treat extracted points as unordered key-value pairs, ignoring the temporal structure of t…
