Signboard OCR & Transliteration
Multilingual OCR pipeline that uses Faster R-CNN for text detection and an attention-based GRU encoder-decoder for Hindi-to-English transliteration. It reaches 92% detection accuracy and 89% transliteration accuracy.
CS725 — Foundations of Machine Learning · IIT Bombay · Aug – Nov 2024
GitHub: IITB_CS725
Overview
The project reads signboards written in regional Indian languages and renders the text in English. This is a real-world OCR and NLP problem that combines detection, recognition, and transliteration in a single pipeline.
Pipeline
- Text Detection: Faster R-CNN for precise bounding-box localization of text regions in signboard images
- Text Recognition: EasyOCR for extracting text from detected regions across diverse regional scripts
- Transliteration: Attention-based GRU Encoder-Decoder model for Hindi → English, preserving phonetic and contextual meaning
- Preprocessing: binarization and noise reduction to improve OCR accuracy on real-world signboard photos
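To make the transliteration stage more concrete, here is a minimal PyTorch sketch of an attention-based GRU encoder-decoder operating on character IDs. It is an illustration under assumptions, not the project's actual model: the class names (`Encoder`, `AttnDecoder`), the layer sizes, the single-layer GRUs, and the use of dot-product attention are all choices made for this example, since the description above does not specify the attention variant or hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Encodes a sequence of source (e.g. Devanagari) character IDs."""

    def __init__(self, src_vocab: int, emb: int = 32, hid: int = 64):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, emb)
        self.gru = nn.GRU(emb, hid, batch_first=True)

    def forward(self, src):                        # src: (B, S)
        outputs, hidden = self.gru(self.embed(src))
        return outputs, hidden                     # (B, S, H), (1, B, H)


class AttnDecoder(nn.Module):
    """One decoding step with dot-product attention (an assumed variant)."""

    def __init__(self, tgt_vocab: int, emb: int = 32, hid: int = 64):
        super().__init__()
        self.embed = nn.Embedding(tgt_vocab, emb)
        self.gru = nn.GRU(emb + hid, hid, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, prev_token, hidden, enc_outputs):
        # prev_token: (B,), hidden: (1, B, H), enc_outputs: (B, S, H)
        query = hidden[-1].unsqueeze(2)                          # (B, H, 1)
        scores = torch.bmm(enc_outputs, query).squeeze(2)        # (B, S)
        weights = F.softmax(scores, dim=1)                       # (B, S)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs)   # (B, 1, H)
        emb = self.embed(prev_token).unsqueeze(1)                # (B, 1, E)
        output, hidden = self.gru(torch.cat([emb, context], dim=2), hidden)
        logits = self.out(output.squeeze(1))                     # (B, V)
        return logits, hidden, weights


def greedy_decode(encoder, decoder, src, sos_id: int, max_len: int = 20):
    """Greedily emits max_len target character IDs per source sequence."""
    with torch.no_grad():
        enc_outputs, hidden = encoder(src)
        token = torch.full((src.size(0),), sos_id, dtype=torch.long)
        steps = []
        for _ in range(max_len):
            logits, hidden, _ = decoder(token, hidden, enc_outputs)
            token = logits.argmax(dim=1)
            steps.append(token)
        return torch.stack(steps, dim=1)           # (B, max_len)
```

At each step the decoder scores every encoder state against its current hidden state, so it can focus on the source characters most relevant to the next Latin character. A real system would also stop at an end-of-sequence token and train with teacher forcing; both are omitted here for brevity.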
Results
- 92% detection accuracy on the evaluation dataset
- 89% transliteration accuracy on Hindi → English pairs
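The summary does not state how these two figures were computed. As one plausible reading, the sketch below treats detection accuracy as the fraction of ground-truth boxes matched by a predicted box at an IoU threshold, and transliteration accuracy as case-insensitive exact match per word. The function names, the 0.5 threshold, and the matching rule are assumptions for illustration only.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_accuracy(predicted, ground_truth, threshold=0.5):
    """Fraction of ground-truth boxes matched by a distinct predicted box."""
    if not ground_truth:
        return 0.0
    unused = list(predicted)
    hits = 0
    for gt in ground_truth:
        # Pick the best remaining prediction for this ground-truth box.
        best = max(unused, key=lambda p: iou(p, gt), default=None)
        if best is not None and iou(best, gt) >= threshold:
            hits += 1
            unused.remove(best)  # each prediction may match only once
    return hits / len(ground_truth)


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference, ignoring case."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align")
    if not references:
        return 0.0
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)
```

Exact match is a strict measure for transliteration, since several Latin spellings of one Hindi word can be acceptable; a character-level metric such as edit distance would be a softer alternative.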
Stack
Python · PyTorch · Faster R-CNN · EasyOCR · GRU Encoder-Decoder · OpenCV