License Plate Character OCR

Project Description
Introduction
This project focused on developing an Optical Character Recognition (OCR) model as part of an Automatic License Plate Recognition (ALPR) pipeline. The goal was to accurately recognize individual license plate characters using a Convolutional Neural Network (CNN), with performance improved through transfer learning. The work addressed challenges common in real-world license plate datasets: low image quality, diverse fonts, and ambiguous characters.
My Role
As part of a collaborative machine learning team, I contributed to model development, dataset creation, image preprocessing, and model evaluation.
Process & Approach
We began by training a CNN from scratch on a clean OCR dataset, where it achieved high accuracy (~98%). To evaluate its generalization, we applied it to a custom dataset of single characters cropped from real license plates, which we assembled by hand with the help of a YOLOv8 segmentation model for plate detection. The resulting domain shift caused accuracy to fall to 59.04%, prompting us to apply transfer learning.
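For illustration, a character classifier along these lines might look like the following PyTorch sketch. The exact architecture isn't specified in this write-up, so the layer sizes, the 32x32 grayscale input, and the 36 output classes (digits 0-9 plus A-Z) are assumptions.

```python
import torch
import torch.nn as nn

class CharCNN(nn.Module):
    """Small CNN for single-character OCR on 32x32 grayscale crops (assumed input size)."""
    def __init__(self, num_classes: int = 36):  # 0-9 plus A-Z, an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 8x8 -> 4x4
        )
        self.classifier = nn.Linear(128 * 4 * 4, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = CharCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step on a dummy batch, standing in for the clean OCR dataset.
images = torch.randn(16, 1, 32, 32)
labels = torch.randint(0, 36, (16,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```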
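The dataset-assembly step can be partly scripted around the plate detector. The sketch below uses the ultralytics YOLOv8 API to crop detected plates from raw photos; the weights file plate_seg.pt, the directory names, and the use of bounding boxes rather than segmentation masks are assumptions for illustration, not our exact tooling.

```python
from pathlib import Path

import cv2
from ultralytics import YOLO

model = YOLO("plate_seg.pt")  # hypothetical custom-trained YOLOv8 weights
out_dir = Path("plate_crops")
out_dir.mkdir(exist_ok=True)

for img_path in Path("raw_images").glob("*.jpg"):
    image = cv2.imread(str(img_path))
    results = model(image)  # one Results object per input image
    for i, box in enumerate(results[0].boxes.xyxy):
        x1, y1, x2, y2 = map(int, box.tolist())
        crop = image[y1:y2, x1:x2]  # plate region, to be split into characters by hand
        cv2.imwrite(str(out_dir / f"{img_path.stem}_{i}.png"), crop)
```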
We first retrained only the classifier layer of the pre-trained model, keeping the convolutional layers frozen, which raised accuracy to 70.75%. Fine-tuning all layers with a low learning rate then increased performance to 73.58%. A key challenge was distinguishing visually similar characters such as 'O' vs. '0' and 'I' vs. '1', which remained difficult even for human labelers. Despite the small dataset, we achieved meaningful improvements by refining the preprocessing steps and preserving the features learned in the original domain.
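The two-stage adaptation can be expressed in a few lines of PyTorch. This sketch reuses the hypothetical CharCNN class from the earlier sketch and a made-up checkpoint filename; the training loops themselves are elided.

```python
import torch

# Phase 1: freeze the convolutional backbone, retrain only the classifier head.
model = CharCNN()
model.load_state_dict(torch.load("char_cnn_clean.pt"))  # hypothetical checkpoint
for param in model.features.parameters():
    param.requires_grad = False
head_optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
# ... train on the license-plate character crops until the head converges ...

# Phase 2: unfreeze everything and fine-tune end to end with a low learning
# rate, so the features learned on the clean OCR data are not destroyed.
for param in model.parameters():
    param.requires_grad = True
finetune_optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
# ... continue training on the license-plate crops for a few more epochs ...
```

Keeping the fine-tuning learning rate one or two orders of magnitude below the original one is the standard way to adapt to a small, shifted dataset without overwriting the source-domain features.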
Insights
This project demonstrated the effectiveness of transfer learning for adapting OCR models to low-data, domain-shifted applications. It also highlighted the value of combining traditional CNN pipelines with modern object detection tools (YOLO) for rapid dataset creation. The results lay a strong foundation for future ALPR systems involving character-level recognition under real-world constraints.