Datalab has launched lift, a 9B-parameter open-weights vision model designed for structured data extraction from PDFs and images. The model takes a JSON schema as input and generates a JSON object conforming to that schema, achieving 90.2% field accuracy on a benchmark dataset. It processes entire multi-page documents in a single pass and offers schema-constrained decoding to ensure the output is structurally valid.
AI IMPACT This model could streamline data extraction workflows for businesses by providing a self-hostable, open-weights option for converting unstructured document data into structured JSON.
RANK_REASON Research release of a new open-weights model with performance metrics.
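To make the schema-in, JSON-out idea concrete, here is a minimal Python sketch of what "conforming to a schema" and "field accuracy" can mean. It does not call lift or reproduce its API. The invoice schema, the sample output, and the helpers `conforms` and `field_accuracy` are all hypothetical, and the accuracy definition (exact match per top-level field) is an assumption, since the summary does not say how the benchmark scores fields.

```python
import json

# Hypothetical schema for illustration only; not taken from the lift release.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "paid": {"type": "boolean"},
    },
    "required": ["invoice_number", "total"],
}

# Maps JSON Schema type names to the Python types json.loads produces.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(obj, schema):
    """Check a small subset of JSON Schema: type, properties, required."""
    expected = schema.get("type")
    if expected is not None:
        if not isinstance(obj, _JSON_TYPES[expected]):
            return False
        # bool is a subclass of int in Python, so reject it for "number".
        if expected == "number" and isinstance(obj, bool):
            return False
    if expected == "object":
        if any(key not in obj for key in schema.get("required", [])):
            return False
        for key, sub_schema in schema.get("properties", {}).items():
            if key in obj and not conforms(obj[key], sub_schema):
                return False
    return True


def field_accuracy(predicted, expected):
    """Fraction of expected top-level fields matched exactly (assumed metric)."""
    if not expected:
        return 1.0
    hits = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return hits / len(expected)


# Stand-in for a model response; a real one would come from the model itself.
model_output = json.loads(
    '{"invoice_number": "INV-001", "total": 1250.5, "paid": false}'
)
ground_truth = {"invoice_number": "INV-001", "total": 1250.5, "paid": True}

print(conforms(model_output, INVOICE_SCHEMA))
print(field_accuracy(model_output, ground_truth))
```

The sample output is structurally valid yet gets one of three fields wrong, which illustrates why schema-constrained decoding can guarantee the shape of the JSON but not the correctness of the extracted values.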
AI-generated summary · Google Gemini · from 2 sources.