PulseAugur

Consensus Entropy improves VLM OCR accuracy by measuring inter-model agreement

Researchers have developed a new metric called Consensus Entropy (CE) to assess the reliability of Optical Character Recognition (OCR) outputs from Vision-Language Models (VLMs). CE measures the agreement between multiple VLMs, on the hypothesis that correct predictions converge across models while errors diverge. The metric is training-free and can be integrated into a framework called CE-OCR, which uses ensemble agreement to verify and select high-quality OCR results, reportedly improving F1 scores by over 42% compared to using a VLM as a judge.
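The agreement idea can be illustrated with a minimal sketch. It is not the paper's definition of Consensus Entropy: the summary does not give the formula, so this example assumes a simple stand-in, the mean pairwise string dissimilarity between model outputs, computed with Python's standard `difflib`. The function names and the `threshold` value are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_disagreement(a: str, b: str) -> float:
    """Return 1 - similarity ratio; 0.0 means the two strings are identical."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def consensus_score(outputs: list[str]) -> float:
    """Mean pairwise disagreement across model outputs (lower = more agreement).

    Assumption: this is a stand-in for an agreement metric, not the
    Consensus Entropy formula from the paper.
    """
    if len(outputs) < 2:
        raise ValueError("need at least two model outputs")
    pairs = list(combinations(outputs, 2))
    return sum(pairwise_disagreement(a, b) for a, b in pairs) / len(pairs)


def select_output(outputs: list[str], threshold: float = 0.2):
    """Return the output closest to the others, or None if models disagree.

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    if consensus_score(outputs) > threshold:
        return None  # low agreement: treat the sample as unreliable

    def mean_distance(i: int) -> float:
        others = [o for j, o in enumerate(outputs) if j != i]
        return sum(pairwise_disagreement(outputs[i], o) for o in others) / len(others)

    best = min(range(len(outputs)), key=mean_distance)
    return outputs[best]


if __name__ == "__main__":
    agreeing = ["Invoice 1024", "Invoice 1024", "Invoice 1O24"]
    diverging = ["hello world", "xyz 123", "qqqq"]
    print(select_output(agreeing))
    print(select_output(diverging))
```

In this sketch, three mostly identical transcriptions yield a low score and the most central output is kept, while three unrelated transcriptions exceed the threshold and the sample is flagged instead of being selected.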

Impact: Introduces a novel, training-free method for improving the quality and reliability of OCR outputs from VLMs, potentially enhancing data generation for LLM training.

Ranking rationale: The cluster contains an academic paper detailing a new method for evaluating OCR outputs from VLMs.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 1 source.


Sources [1]

  1. arXiv cs.CV TIER_1 English (EN) · Yulong Zhang, Tianyi Liang, Xinyue Huang, Erfei Cui, Guoqing Wang, Xu Guo, Chenhui Li, Gongshen Liu

    Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR

    arXiv:2504.11101v4 Announce Type: replace Abstract: Optical Character Recognition (OCR) is fundamental to Vision-Language Models (VLMs) and high-quality data generation for LLM training. Yet, despite progress in average OCR accuracy, state-of-the-art VLMs still struggle with dete…