PulseAugur

optical character recognition

PulseAugur coverage of optical character recognition (OCR): every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 14 (90 days: 14)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 11 (90 days: 11)
[Chart: tier distribution, 90 days]
Sentiment · 30 days: 3 days with sentiment data

Recent · Page 1 of 1 · 14 items
  1. TOOL · CL_45082

    Large multimodal models show mixed results for medical image PHI detection

    Researchers evaluated large multimodal models (LMMs) like GPT-4o and Gemini 2.5 Flash for detecting protected health information (PHI) in medical images. While LMMs showed improved text recognition (lower Word Error Rat…

  2. TOOL · CL_44780

    Vision-Language Models enhance Italian parliamentary speech analysis

    Researchers have developed a new pipeline using Vision-Language Models to improve the transcription and analysis of historical Italian parliamentary speeches. This approach leverages OCR for initial text extraction and …

  3. TOOL · CL_38441

    AI automates healthcare data to improve clinical decision support

    Modern healthcare faces a data liquidity problem, where a significant portion of patient information remains trapped in unstructured formats like scanned documents and free-text notes. This necessitates manual data entr…

  4. RESEARCH · CL_34750

    AI logging gaps trigger $1.5M HIPAA fine for hospital

    Healthcare organizations are facing significant HIPAA violations due to inadequate logging of AI system activity, leading to substantial fines. A recent case involved a hospital settling for $1.5 million because its AI …

  5. TOOL · CL_20775

    Consensus Entropy improves VLM OCR accuracy by measuring inter-model agreement

    Researchers have developed a new metric called Consensus Entropy (CE) to assess the reliability of Optical Character Recognition (OCR) outputs from Vision-Language Models (VLMs). CE measures the agreement between multip…

  6. RESEARCH · CL_18242

    New CC-OCR V2 benchmark reveals LMMs fall short in real-world document processing

    A new benchmark, CC-OCR V2, has been released to evaluate Large Multimodal Models (LMMs) on real-world document processing tasks. The benchmark includes 7,093 challenging samples across five OCR-centric tracks, addressi…

  7. TOOL · CL_15804

    AI classifies historical document pages for tailored content processing

    Researchers have developed an AI-powered image classification system to automatically categorize pages from historical documents. This system aims to streamline the processing of digitized archives by identifying differ…

  8. TOOL · CL_15586

    New OCR benchmark reveals accuracy doesn't guarantee RAG performance

    A new benchmark has been developed to evaluate the robustness of Optical Character Recognition (OCR) systems specifically for Retrieval-Augmented Generation (RAG) applications. Current OCR benchmarks using character-lev…

  9. TOOL · CL_10878

    Sun Finance boosts ID verification accuracy with generative AI on AWS

    Sun Finance, a Latvian fintech company, has successfully automated its identity document extraction and fraud detection processes using generative AI on Amazon Web Services (AWS). The new system, developed in partnershi…

  10. RESEARCH · CL_08205

    Researchers release dataset of AI-generated images from GPT-Image-2's first week

    Researchers have released a dataset of over 10,000 images generated by OpenAI's GPT-Image-2, collected in the first week following its April 21, 2026 release. The dataset, sourced from Twitter/X, was curated using a mul…

  11. RESEARCH · CL_06544

    iWatchRoad system uses YOLO to detect and map potholes for smart cities

    Researchers have developed iWatchRoad, an end-to-end system designed for the scalable detection and geospatial visualization of potholes. The system utilizes a fine-tuned YOLO model for real-time pothole identification …

  12. RESEARCH · CL_06492

    New dataset and methods tackle low-light scene text recognition challenges

    Researchers have introduced LSTR, a large-scale dataset for low-light scene text recognition, and ESTR, a smaller evaluation set of real nighttime street scenes. They explored two approaches: fine-tuning existing OCR mo…

  13. RESEARCH · CL_06398

    HalalBench benchmark tackles OCR challenges for multilingual food packaging ingredient extraction

    Researchers have introduced HalalBench, a new multilingual benchmark designed to evaluate Optical Character Recognition (OCR) performance specifically on food packaging ingredient labels. The benchmark addresses the uni…

  14. RESEARCH · CL_03553

    Older, cheaper LLMs often match premium OCR accuracy at lower cost

    Researchers have open-sourced a new benchmark and framework for evaluating Optical Character Recognition (OCR) performance across 18 different large language models (LLMs). Their analysis, involving over 7,500 calls, re…