๐Ÿ“โœจClearText

Published at
1/15/2025
Categories
devchallenge
githubchallenge
webdev
ai
Author
ajinkya_bobade_f1cf60e720

This is a submission for the GitHub Copilot Challenge: Transitions and Transformations

What I Built

I built ClearText, an AI-powered text detection and enhancement tool that makes text in images cleaner.

It's Perfect For 🎯

  • 📄 Document Digitization
  • 📚 Book Scanning
  • 📱 Mobile Photos of Text
  • 🖨️ Improving Scanned Documents
  • 📑 Text Enhancement in Images

Demo

ClearText Demo

Repo

GitHub Repository - ClearText

Here's an example of what ClearText can do:

ClearText Demo

ClearText takes an input image (left-hand side), removes the noise, and outputs only the text (right-hand side).
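
As a rough illustration of that idea (not ClearText's actual code), the sketch below keeps the pixels inside detected text boxes and paints everything else white. The function name `keep_text_regions`, the `(x0, y0, x1, y1)` box format, and the white background are assumptions made for this example.

```python
import numpy as np


def keep_text_regions(image, boxes, background=255):
    """Return a copy of `image` where everything outside `boxes` is blanked.

    image: H x W (grayscale) or H x W x C uint8 array.
    boxes: iterable of (x0, y0, x1, y1) pixel boxes, end-exclusive (assumed format).
    """
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=bool)
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image so out-of-range detections are harmless.
        x0, x1 = np.clip([x0, x1], 0, width)
        y0, y1 = np.clip([y0, y1], 0, height)
        mask[y0:y1, x0:x1] = True

    # Start from a blank page and copy back only the masked (text) pixels.
    cleaned = np.full_like(image, background)
    cleaned[mask] = image[mask]
    return cleaned


# Tiny synthetic example: a 6x8 grayscale "page" with one text box.
page = np.arange(48, dtype=np.uint8).reshape(6, 8)
result = keep_text_regions(page, [(2, 1, 5, 3)])
```

In a real pipeline the boxes would come from the text detector rather than being written by hand.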

ClearText could be useful in the following fields:

Document Processing 📄

  • Banking & Finance
    • 🏦 Check processing
    • 📊 Financial statement digitization

Healthcare 🏥

  • Medical Records
    • 📋 Patient records digitization
    • 🔬 Lab report enhancement

Legal Industry ⚖️

  • Document Management
    • 📜 Contract digitization
    • 🗄️ Case file processing

Academic Use Cases 📚

  • 📖 Textbook scanning
  • 📑 Research paper digitization

Copilot Experience 🤖

I used Copilot extensively to complete this project. Here is how it helped me:

Code Completion 📝

  • Auto-completed common OpenCV operations
  • Suggested image processing parameters
  • Completed function signatures for Streamlit components

Chat Assistance 💬

  • Debugged ONNX model loading issues
  • Explained the image processing pipeline
  • Suggested optimizations for image transformations

Inline Suggestions ⚡

  • Recommended error handling patterns
  • Suggested variable names and types

Model Switching 🔄

Used different models for specific tasks:

  • Code Completion: GitHub Copilot
  • Documentation: Claude
  • Debugging: GPT-4

Common Prompts Used 🎯

# Function implementation
/explain image processing pipeline
/suggest error handling
/optimize performance

Code Edits ✍️

  • Refactored image processing functions
  • Added blur/no-blur options
  • Improved error messages
  • Enhanced documentation

Project Evolution & Contributions

Building on Open Source

This project builds upon the excellent CRAFT text detection model by CLOVA AI Research, while making significant architectural and functional improvements:

1. Production-Ready Architecture 🏗️

  • Converted the research-focused PyTorch model to production-ready ONNX format
  • Leveraged ONNX Runtime for optimized inference across different hardware
  • Added complete Docker containerization for reliable deployment
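
For readers unfamiliar with ONNX Runtime, the following is a minimal sketch of what inference on an exported model can look like; it is not the project's code. The file name `craft.onnx`, the helper names, and the ImageNet-style mean and standard deviation are assumptions for illustration. The ONNX Runtime import sits inside `run_model` so the preprocessing can be tried without the package installed.

```python
import numpy as np

# ImageNet-style statistics, assumed here; the real model may expect other values.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_model_input(image_rgb):
    """Convert an H x W x 3 uint8 RGB image to a 1 x 3 x H x W float32 tensor."""
    scaled = image_rgb.astype(np.float32) / 255.0
    normalized = (scaled - MEAN) / STD
    # HWC -> CHW, then add the batch dimension.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1)[np.newaxis])


def run_model(model_path, image_rgb):
    """Run one forward pass with ONNX Runtime on the CPU."""
    import onnxruntime as ort  # imported lazily; needs the onnxruntime package

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    # Passing None as the first argument returns every model output.
    return session.run(None, {input_name: to_model_input(image_rgb)})


# Preprocessing only; run_model needs a real model file such as "craft.onnx".
dummy = np.full((32, 48, 3), 255, dtype=np.uint8)
tensor = to_model_input(dummy)
```

Choosing a different execution provider in the `providers` list is how ONNX Runtime targets other hardware, which is the portability benefit mentioned above.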

2. Enhanced Text Processing Pipeline 🔄

The original CRAFT model provides basic text detection. ClearText significantly expands on this by:

  • Adding custom image preprocessing for better text clarity
  • Implementing new post-processing transforms for enhanced output quality
  • Creating an entirely new text enhancement pipeline
  • Developing a user-friendly web interface for easy interaction
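
To make the post-processing idea concrete, here is a hedged sketch of one common step: turning a text-region score map (the kind of heat map CRAFT produces) into bounding boxes by thresholding and grouping connected pixels. The function name, the 0.5 threshold, and the 4-connectivity rule are assumptions for this example, not ClearText's actual transforms.

```python
from collections import deque

import numpy as np


def score_map_to_boxes(score_map, threshold=0.5):
    """Return (x0, y0, x1, y1) boxes, end-exclusive, one per connected region
    of `score_map` whose values exceed `threshold` (4-connectivity)."""
    active = score_map > threshold
    visited = np.zeros_like(active, dtype=bool)
    height, width = active.shape
    boxes = []
    for start_y, start_x in zip(*np.nonzero(active)):
        if visited[start_y, start_x]:
            continue
        # Breadth-first flood fill over one connected region.
        queue = deque([(start_y, start_x)])
        visited[start_y, start_x] = True
        x0 = x1 = start_x
        y0 = y1 = start_y
        while queue:
            y, x = queue.popleft()
            x0, x1 = min(x0, x), max(x1, x)
            y0, y1 = min(y0, y), max(y1, y)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and active[ny, nx] and not visited[ny, nx]:
                    visited[ny, nx] = True
                    queue.append((ny, nx))
        boxes.append((int(x0), int(y0), int(x1) + 1, int(y1) + 1))
    return boxes


# Synthetic score map with two "words" and one weak response.
scores = np.zeros((6, 10), dtype=np.float32)
scores[1:3, 1:4] = 0.9   # first "word"
scores[4:5, 6:9] = 0.8   # second "word"
scores[0, 9] = 0.2       # below the threshold, so it is ignored
boxes = score_map_to_boxes(scores)
```

On full-size images, OpenCV's `cv2.connectedComponentsWithStats` does the same grouping much faster; the pure-NumPy loop is used here only to keep the example dependency-light.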

3. Major Output Improvements 📈

ClearText transforms the basic text detection output into a comprehensive text enhancement solution:

  • Original CRAFT: Basic text region detection
  • ClearText Additions:
    • Text clarity enhancement
    • Document digitization capabilities
    • Support for various document types (books, mobile photos, scanned documents)
    • Complete image processing pipeline

Transparency Statement

While this project builds upon CRAFT's foundational text detection capabilities, ClearText represents a significant evolution with entirely new functionality, architecture, and use cases. All original CRAFT code is properly credited and licensed under the MIT License.

Conclusion

Developing ClearText during the GitHub Copilot 1-Day Build Challenge has been an amazing journey. Without Copilot, transforming a complex text detection model into an accessible, user-friendly web application would have been tremendously difficult. The project showcases how AI can bridge the gap between computer vision and practical, everyday use cases.
