ENTITY optical character recognition

PulseAugur coverage of optical character recognition (OCR): every cluster that mentions OCR across labs, papers, and developer communities, ranked by signal.

Total · 30d

17 over 90d

Releases · 30d

0 over 90d

Papers · 30d

13 over 90d

TIER MIX · 90D

research 3
tool 13
meme 1

SENTIMENT · 30D

6 days with sentiment data

RECENT · PAGE 1/1 · 17 TOTAL

TOOL · CL_74626 · Jun 6 · 08:01

AI agent built to safely summarize patient discharge data

This article details the creation of an AI agent designed to summarize patient discharge information from PDF documents. The agent focuses on extracting structured data like diagnoses, medications, and allergies, priori…
MEME · CL_73569 · Jun 5 · 13:01

User seeks open-source workflow for editable text layers in images

A user on Reddit is seeking an open-source method to transform text within an image into editable layers, similar to features found in Canva or Ideogram. The desired workflow involves detecting text, reconstructing the …
RESEARCH · CL_70572 · Jun 3 · 15:41

AI automates Swiss initiative signature validation

Researchers have developed an AI-powered system to automate the analysis of handwritten signature lists used in Swiss popular initiatives. The proposed pipeline combines Optical Character Recognition (OCR) with writer r…
TOOL · CL_45082 · May 22 · 04:00

Large multimodal models show mixed results for medical image PHI detection

Researchers evaluated large multimodal models (LMMs) like GPT-4o and Gemini 2.5 Flash for detecting protected health information (PHI) in medical images. While LMMs showed improved text recognition (lower Word Error Rat…
TOOL · CL_44780 · May 22 · 04:00

Vision-Language Models enhance Italian parliamentary speech analysis

Researchers have developed a new pipeline using Vision-Language Models to improve the transcription and analysis of historical Italian parliamentary speeches. This approach leverages OCR for initial text extraction and …
TOOL · CL_38441 · May 19 · 05:39

AI automates healthcare data to improve clinical decision support

Modern healthcare faces a data liquidity problem, where a significant portion of patient information remains trapped in unstructured formats like scanned documents and free-text notes. This necessitates manual data entr…
RESEARCH · CL_34750 · May 16 · 16:01

AI logging gaps trigger $1.5M HIPAA fine for hospital

Healthcare organizations are facing significant HIPAA violations due to inadequate logging of AI system activity, leading to substantial fines. A recent case involved a hospital settling for $1.5 million because its AI …
TOOL · CL_20775 · May 7 · 04:00

Consensus Entropy improves VLM OCR accuracy by measuring inter-model agreement

Researchers have developed a new metric called Consensus Entropy (CE) to assess the reliability of Optical Character Recognition (OCR) outputs from Vision-Language Models (VLMs). CE measures the agreement between multip…
RESEARCH · CL_18242 · May 5 · 15:56

New CC-OCR V2 benchmark reveals LMMs fall short in real-world document processing

A new benchmark, CC-OCR V2, has been released to evaluate Large Multimodal Models (LMMs) on real-world document processing tasks. The benchmark includes 7,093 challenging samples across five OCR-centric tracks, addressi…
TOOL · CL_15804 · May 5 · 04:00

AI classifies historical document pages for tailored content processing

Researchers have developed an AI-powered image classification system to automatically categorize pages from historical documents. This system aims to streamline the processing of digitized archives by identifying differ…
TOOL · CL_15586 · May 5 · 04:00

New OCR benchmark reveals accuracy doesn't guarantee RAG performance

A new benchmark has been developed to evaluate the robustness of Optical Character Recognition (OCR) systems specifically for Retrieval-Augmented Generation (RAG) applications. Current OCR benchmarks using character-lev…
TOOL · CL_10878 · Apr 30 · 17:00

Sun Finance boosts ID verification accuracy with generative AI on AWS

Sun Finance, a Latvian fintech company, has successfully automated its identity document extraction and fraud detection processes using generative AI on Amazon Web Services (AWS). The new system, developed in partnershi…
RESEARCH · CL_08205 · Apr 28 · 08:35

Researchers release dataset of AI-generated images from GPT-Image-2's first week

Researchers have released a dataset of over 10,000 images generated by OpenAI's GPT-Image-2, collected in the first week following its April 21, 2026 release. The dataset, sourced from Twitter/X, was curated using a mul…
RESEARCH · CL_06544 · Apr 28 · 04:00

iWatchRoad system uses YOLO to detect and map potholes for smart cities

Researchers have developed iWatchRoad, an end-to-end system designed for the scalable detection and geospatial visualization of potholes. The system utilizes a fine-tuned YOLO model for real-time pothole identification …
RESEARCH · CL_06492 · Apr 28 · 04:00

New dataset and methods tackle low-light scene text recognition challenges

Researchers have introduced LSTR, a large-scale dataset for low-light scene text recognition, and ESTR, a smaller evaluation set of real nighttime street scenes. They explored two approaches: fine-tuning existing OCR mo…
RESEARCH · CL_06398 · Apr 28 · 04:00

HalalBench benchmark tackles OCR challenges for multilingual food packaging ingredient extraction

Researchers have introduced HalalBench, a new multilingual benchmark designed to evaluate Optical Character Recognition (OCR) performance specifically on food packaging ingredient labels. The benchmark addresses the uni…
RESEARCH · CL_03553 · Apr 23 · 05:40

Older, cheaper LLMs often match premium OCR accuracy at lower cost

Researchers have open-sourced a new benchmark and framework for evaluating Optical Character Recognition (OCR) performance across 18 different large language models (LLMs). Their analysis, involving over 7,500 calls, re…
