ComProScanner adds VLMs to extract materials data from figures

By PulseAugur Editorial · [1 source] · 2026-06-02 04:00

Researchers have extended ComProScanner, a framework for extracting materials data from scientific literature. The updated version integrates vision-language models (VLMs) to process quantitative data presented in figures, a capability that earlier systems focused on text and tables lacked. In evaluations with Gemini-3-Flash-Preview, the authors report high accuracy and cost-effectiveness when extracting composition-property pairs from scientific charts and plots.

IMPACT Enables more comprehensive automated data extraction from scientific literature, potentially accelerating materials science research.

RANK_REASON Academic paper detailing a new method for data extraction.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Aritra Roy, Enrico Grisan, Chiara Gattinoni, John Buckeridge · 2026-06-02 04:00

Beyond Text and Tables: Vision-Language Model Integration in ComProScanner for Extracting Materials Data from Scientific Figures with High Accuracy

arXiv:2606.00065v1 · Abstract: Automated extraction of materials composition-property data from scientific literature has advanced considerably with the development of large language model-based pipelines; however, existing frameworks remain limited to textual …
