PulseAugur

Open-source VLMs evaluated for grocery product retrieval accuracy

A new paper evaluates 190 open-source vision-language models (VLMs) on grocery product retrieval, a crucial component of checkout-free retail. The research found that data quality matters more than model scale for accuracy gains. The study also highlighted that smaller, efficient models can outperform larger ones if trained on cleaner data, and it introduced a new metric called 'semantic power density' to measure model efficiency. Despite strong performance in recalling relevant items, current state-of-the-art models struggle to rank visually similar products precisely.

Impact: Identifies key factors for improving grocery product retrieval accuracy with open-source VLMs, potentially impacting retail automation.

Ranking rationale: Academic paper evaluating open-source models on a specific task.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.CV TIER_1 English(EN) · Rowel O. Atienza

    What Matters for Grocery Product Retrieval with Open Source Vision Language Models

    Multimodal product retrieval (MPR) underpins checkout-free retail and automated inventory systems, yet it demands fine-grained SKU discrimination that standard vision-language benchmarks fail to capture. We present the first systematic zero-shot evaluation of 190 open-source VLMs…