We benchmarked 18 LLMs on OCR (7k+ calls) - cheaper/old models oftentimes win. Full dataset + framework open-sourced. [R]

Older, cheaper LLMs often match premium OCR accuracy at lower cost

By PulseAugur Editorial Team · [1 source] · 2026-04-23 05:40

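Researchers have open-sourced a new benchmark and framework for evaluating the optical character recognition (OCR) performance of 18 large language models (LLMs). Their analysis, covering more than 7,500 calls, shows that for standard OCR tasks, older and cheaper models often match the accuracy of premium models at significantly lower cost. The project includes a 42-document dataset, a leaderboard, and a tool that lets users test their own documents, with the aim of helping teams avoid overpaying for OCR.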
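The cost-versus-accuracy comparison described above can be illustrated with a minimal sketch. It is not the benchmark's actual framework: the model names, prices, sample text, and the `tolerance` parameter are all hypothetical, and `difflib.SequenceMatcher` is used here only as a simple stand-in for whatever accuracy metric the project really applies.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class ModelResult:
    name: str                # hypothetical model label, not from the benchmark
    output: str              # text the model returned for one document
    cost_per_1k_docs: float  # hypothetical price in USD


def similarity(expected: str, actual: str) -> float:
    """Return a 0..1 similarity score between ground truth and model output."""
    return SequenceMatcher(None, expected, actual).ratio()


def cheapest_within_tolerance(results, expected, tolerance=0.01):
    """Pick the cheapest model whose score is within `tolerance` of the best."""
    scored = [(similarity(expected, r.output), r) for r in results]
    best_score = max(score for score, _ in scored)
    eligible = [r for score, r in scored if score >= best_score - tolerance]
    return min(eligible, key=lambda r: r.cost_per_1k_docs)


# Made-up example data, for illustration only.
ground_truth = "Invoice 1042 total: 318.50 EUR"
results = [
    ModelResult("flagship-model", "Invoice 1042 total: 318.50 EUR", 15.0),
    ModelResult("older-small-model", "Invoice 1042 total: 318.50 EUR", 1.2),
    ModelResult("tiny-model", "Invoice 1O42 totai: 3l8.5O EUR", 0.4),
]
choice = cheapest_within_tolerance(results, ground_truth)
print(choice.name)
```

With these invented inputs, the two models that reproduce the text exactly tie on score, so the cheaper of the two is selected, while the cheapest model overall is excluded because its output contains character errors.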

Impact: Identifies cost-effective LLM options for OCR, which could lower the operating costs of AI-driven document processing.

Ranking rationale: Open-source benchmark and dataset release for LLM evaluation.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/MachineLearning (Tier 1, English) · /u/TimoKerre · 2026-04-23 05:40

We benchmarked 18 LLMs on OCR (7k+ calls) — cheaper/old models oftentimes win. Full dataset + framework open-sourced. [R]

TL;DR: We were overpaying for OCR, so we compared flagship models with cheaper and older models. New mini-bench + leaderboard. Free tool to test your own documents. Open Source. We’ve been looking at OCR / document extracti…

