How to improve OCR accuracy? | My 5-year experience

Published at: 7/12/2024
Categories: ocr, opensource, android
Author: abanoubha

my experience with OCR technologies

I created my first image-to-text app on Oct 6th, 2018, more than five years ago. I have been learning, rewriting, iterating, and experimenting with OCR technology ever since.

Over that time I have built several apps that extract text from images and photos.

After all these years, I have one simple thing to say: "What is measured, improves." This quote is from the book "The Effective Executive" by Peter F. Drucker.

ideas to improve OCR accuracy

Improving the accuracy of OCR-extracted text is not a small task. The obvious answers to how to improve text extraction from images are:

  • improve the "traineddata" models
  • use auto-correct; for example, correct "boxmg" into "boxing"
  • use high-DPI photos
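
A minimal Python sketch of the auto-correct idea, assuming a plain dictionary lookup with fuzzy matching from the standard library's difflib. The word list, the `correct_word` and `correct_text` names, and the 0.7 cutoff are all illustrative assumptions, not what any particular OCR app uses:

```python
from difflib import get_close_matches

# Hypothetical vocabulary for illustration; a real app would load a full word list.
VOCABULARY = ["boxing", "boring", "reading", "text"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.7):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in vocabulary:
        return word  # already a known word, leave it alone
    # get_close_matches ranks candidates by difflib's similarity ratio
    matches = get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, vocabulary=VOCABULARY):
    """Apply word-level correction to whitespace-separated OCR output."""
    return " ".join(correct_word(w, vocabulary) for w in text.split())


print(correct_text("I like boxmg"))
```

The cutoff matters: set too low, valid but unknown words get "corrected" into the wrong thing, so it is worth tuning against real OCR output.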

I used to focus on all of these points and more, such as:

  • pre-processing images/photos with
    • a black and white filter
    • binarization with an adaptive threshold
  • increasing the DPI of the image artificially to around 300 DPI
  • using the best models from Tesseract OCR despite their large size
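
The pre-processing steps above can be sketched in pure Python to show what each one does. This is only an illustration under stated assumptions: the image is a small list of rows of pixels, the function names, window size, and offset are made up for the example, and a real app would use an image library (OpenCV's `cv2.adaptiveThreshold`, for instance) instead of Python loops:

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_rows]


def adaptive_threshold(gray, window=3, offset=5):
    """Binarize with a local mean: a pixel becomes 0 (ink) if it is darker than
    the mean of its surrounding window minus `offset`, otherwise 255 (paper)."""
    height, width = len(gray), len(gray[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window at the image borders
            ys = range(max(0, y - half), min(height, y + half + 1))
            xs = range(max(0, x - half), min(width, x + half + 1))
            neighbours = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(neighbours) / len(neighbours)
            row.append(0 if gray[y][x] < local_mean - offset else 255)
        result.append(row)
    return result


def upscale_factor(current_dpi, target_dpi=300):
    """How much to enlarge an image so its effective DPI reaches the target."""
    return max(1.0, target_dpi / current_dpi)


# A 3x3 light page with one dark pixel in the middle
page = [[200, 200, 200], [200, 50, 200], [200, 200, 200]]
print(adaptive_threshold(page))
print(upscale_factor(150))
```

Because the threshold is computed per neighbourhood rather than once for the whole image, uneven lighting across the page has less effect on which pixels count as ink.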

These ideas improved both performance and text accuracy to a certain extent. Don't get me wrong: these tips and tricks made my apps run fast enough with good enough accuracy. But I see more accurate apps. For example, Google ML Kit reaches almost 99% accuracy when extracting text from clear images.

how to measure OCR accuracy improvement/progress?

My measurements are not good enough. I need to follow the "What is measured, improves" principle. I need a set of photos of paper documents to measure my app's accuracy against, a sample that represents real-world use cases. Then I need to refactor and improve text extraction accuracy against this sample of images, so that people see the improvements in their daily task of typing a paper document into a digital one.
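
One common way to put a number on accuracy is character-level accuracy based on edit distance. The post does not name a metric, so treat this as an assumption; the function names are hypothetical, and the sketch simply compares OCR output against a hand-typed ground truth:

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(truth, ocr_output):
    """1.0 means a perfect match; values are clamped so they never drop below 0."""
    if not truth:
        return 1.0 if not ocr_output else 0.0
    errors = levenshtein(truth, ocr_output)
    return max(0.0, 1.0 - errors / len(truth))


print(character_accuracy("boxing", "boxmg"))
```

Tracking this number for the same images before and after each change shows whether a tweak actually helped.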

specifications of the image sample

I need to collect that image sample with real-world use cases in mind, so I need these images:

  • a photo of an old book, with the page lying flat on an even surface
  • a photo of an old book, with the page warped because the book is open
  • a photo of a modern book with a clear white background
  • a photo of a modern book with an image/illustration between paragraphs
  • a photo of an article written in Arabic with some words in English
  • a photo of an old, yellowed book page with a cursive font

This is the initial set of specifications for the photos I will collect. If you have a specific use case, send me some sample photos on Twitter (X) or LinkedIn.
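
A sample set like this could be organized as a small manifest and scored per category, so that a regression on, say, warped pages is visible on its own. The sketch below is hypothetical throughout: the file names, ground-truth strings, and the `evaluate` function are placeholders, `run_ocr` stands for whatever OCR engine is being tested, and difflib's similarity ratio is used as a simple stand-in score:

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical manifest: image names and ground-truth text are placeholders.
SAMPLES = [
    {"category": "old book, flat page", "image": "old_flat.jpg",
     "truth": "sample text one"},
    {"category": "old book, warped page", "image": "old_warped.jpg",
     "truth": "sample text two"},
    {"category": "Arabic with English words", "image": "arabic_mixed.jpg",
     "truth": "sample text three"},
]


def similarity(truth, output):
    """Similarity in [0, 1] between the ground truth and the OCR output."""
    return SequenceMatcher(None, truth, output).ratio()


def evaluate(samples, run_ocr):
    """Return the mean similarity per category.
    `run_ocr` is any callable that takes an image path and returns text."""
    scores = defaultdict(list)
    for sample in samples:
        output = run_ocr(sample["image"])
        scores[sample["category"]].append(similarity(sample["truth"], output))
    return {category: sum(values) / len(values)
            for category, values in scores.items()}


# Usage with a fake OCR engine that reads one page perfectly and one not at all
fake_outputs = {"old_flat.jpg": "sample text one",
                "old_warped.jpg": "",
                "arabic_mixed.jpg": "sample text three"}
report = evaluate(SAMPLES, lambda path: fake_outputs[path])
for category, score in report.items():
    print(f"{category}: {score:.2f}")
```

Passing the OCR engine in as a callable keeps the harness independent of any one library, so the same sample set can compare different engines or different pre-processing settings.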

I hope you enjoyed reading this post as much as I enjoyed writing it. If you know someone who can benefit from this information, send them a link to this post. If you want to get notified about new posts, follow me on YouTube and GitHub.
