Sinhala Character Predictor
A machine learning project for optical character recognition (OCR) of the Sinhala script. It implements neural network models that recognize and predict Sinhala characters from image inputs, addressing challenges specific to this non-Latin script, such as diacritical marks and similar-looking characters.
Year: 2023
Features: 6
Technologies: 6
Challenges
- Limited availability of labeled Sinhala character datasets
- Complexity of Sinhala script with diacritical marks
- Balancing model accuracy with inference speed
- Handling similar-looking characters effectively
Solutions
- Created custom dataset through data collection and annotation
- Implemented specialized preprocessing for Sinhala script
- Used transfer learning to improve model performance
- Applied data augmentation to build robust models
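The preprocessing and augmentation steps listed above could look roughly like the following minimal sketch. It assumes grayscale images stored as nested lists of 8-bit values. The function names `normalize`, `shift`, and `augment` are illustrative and not taken from the project, and random translation is only one of many possible augmentations.

```python
import random


def normalize(image):
    """Scale 8-bit grayscale pixel values (0-255) into the range [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def shift(image, dx, dy, fill=0.0):
    """Translate the image by (dx, dy) pixels, padding exposed areas with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def augment(image, max_shift=2, rng=None):
    """Return a copy shifted by a random offset of at most `max_shift` pixels."""
    rng = rng or random.Random()
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift(image, dx, dy)


# Hypothetical 3x3 image with a single vertical stroke in the middle column.
sample = [
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
]
normalized = normalize(sample)
moved_right = shift(normalized, dx=1, dy=0)  # stroke moves to the last column
```

Keeping `max_shift` small is a deliberate choice in this sketch: large translations could push diacritical marks out of the frame and change which character the image represents.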
Key Features
Convolutional neural network for image classification
Image preprocessing pipeline with normalization
Data augmentation for improved model robustness
Multi-layer neural network architecture
Real-time character prediction capability
Model evaluation with precision, recall, and F1 metrics
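The precision, recall, and F1 evaluation mentioned above can be computed per character class as in the sketch below. The helper names `per_class_metrics` and `macro_average` and the sample labels are hypothetical, chosen only for illustration. They are not the project's actual evaluation code or results.

```python
def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_average(metrics, key):
    """Average one metric over all classes, weighting each class equally."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


# Hypothetical ground truth and predictions for two Sinhala characters.
y_true = ["ක", "ක", "ග", "ග"]
y_pred = ["ක", "ග", "ග", "ග"]

scores = per_class_metrics(y_true, y_pred)
macro_f1 = macro_average(scores, "f1")
```

Per-class scores are useful here because they show which similar-looking characters the model confuses, something a single accuracy figure would hide.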
Technologies
Learnings
- Developed proficiency in neural network architecture design
- Gained expertise in image preprocessing and augmentation
- Learned language-specific ML challenges and solutions
- Mastered model evaluation and performance metrics
Highlights
Neural Network Model
OCR Implementation
Character Recognition