PulseAugur

Datalab releases lift, a 9B open-weights vision model for structured PDF extraction

Datalab has launched lift, a 9B-parameter open-weights vision model for structured data extraction from PDFs and images. The model takes a JSON schema as input and generates a JSON object conforming to that schema, achieving 90.2% field accuracy on a 225-document benchmark. lift processes an entire multi-page document in a single pass and uses schema-constrained decoding to guarantee that the output is structurally valid.
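As a rough illustration of the schema-in, JSON-out contract described above, the sketch below defines a small schema and checks a sample output against it. Nothing here calls lift itself: the invoice field names, the sample output, and the `conforms` helper are all hypothetical, and the checker covers only the tiny subset of JSON Schema used in this example. The nullable types mirror the coverage's description of the model returning null for absent fields.

```python
import json

# Hypothetical schema for illustration; these field names are not from Datalab.
# Each field is nullable so an absent value can be reported as null.
SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": ["string", "null"]},
        "total": {"type": ["number", "null"]},
        "due_date": {"type": ["string", "null"]},
    },
    "required": ["invoice_number", "total", "due_date"],
}

# Maps the JSON Schema type names used above to Python types.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "null": type(None),
}


def conforms(obj, schema):
    """Check a flat object against the small schema subset used here."""
    if not isinstance(obj, dict):
        return False
    # Every required field must be present (its value may still be null).
    if any(key not in obj for key in schema.get("required", [])):
        return False
    for key, spec in schema["properties"].items():
        if key not in obj:
            continue
        value = obj[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            return False
        allowed = spec["type"]
        if isinstance(allowed, str):
            allowed = [allowed]
        if not any(isinstance(value, _JSON_TYPES[t]) for t in allowed):
            return False
    return True


# Hypothetical model output: due_date was not found in the document.
raw_output = '{"invoice_number": "INV-001", "total": 1250.5, "due_date": null}'
extracted = json.loads(raw_output)
is_valid = conforms(extracted, SCHEMA)
```

With schema-constrained decoding, a check like this should always pass on structure. It says nothing about whether the extracted values are correct, which is what the field accuracy figure measures.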

IMPACT This model could streamline data extraction workflows for businesses by providing a self-hostable, open-weights option for converting unstructured document data into structured JSON.

RANK_REASON Research release of a new open-weights model with performance metrics.


AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. MarkTechPost TIER_1 English(EN) · Asif Razzaq ·

    Datalab Releases lift: A 9B Open-Weights Vision Model That Extracts Structured JSON From PDFs Using Schemas

    Datalab released lift, a 9B open-weights vision model that turns PDFs and images into schema-matching JSON. It uses schema-constrained decoding for valid structure and trained abstention to return null instead of hallucinating absent fields, scoring 90.2% field accuracy on a 2…

  2. Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] ·

    Datalab has released lift, a 9B open-weights vision model that extracts structured JSON from PDFs using JSON schemas. It achieves 90.2% field accuracy on a 225-document benchmark and runs in 9.5 seconds per document. https://www.marktechpost.com/2026/06/23/datalab-releases-lift…