Researchers have developed and benchmarked an adaptive Optical Character Recognition (OCR) pipeline for digitizing retail bills across commercial sectors. The system combines a CNN-based image enhancement module, an image quality analyzer, a feedback loop for iterative retries, and an NLP-based correction layer. Tested on a dataset of 360 retail bills, the pipeline achieved a Character Error Rate (CER) of 18.4% and a Word Error Rate (WER) of 27.6%, significantly outperforming the raw Tesseract baseline and running notably faster than EasyOCR.
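For readers unfamiliar with the two metrics, the following is a minimal sketch of how CER and WER are conventionally computed: Levenshtein edit distance between the reference and the OCR output, divided by the reference length in characters or words. The summary does not say how the paper normalizes text or tokenizes words, so the whitespace tokenization and the function names here (`edit_distance`, `cer`, `wer`) are assumptions for illustration only.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits / reference words (whitespace split, assumed)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Both rates are fractions of the reference length, so a CER of 18.4% corresponds to roughly 18 character edits per 100 reference characters. A single wrong character makes the whole word count as an error, which is why WER is usually higher than CER on the same output.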
Impact: Establishes a new benchmark for OCR in retail, potentially improving data extraction efficiency for businesses.
Ranking rationale: Academic paper detailing a new OCR pipeline and its benchmarked performance.
AI-generated summary · Google Gemini · from 2 sources.