AI / ML · 2026 · completed

Sinhala Character Recognition

A machine learning application that recognizes handwritten Sinhala characters from image input. The system uses a K-Nearest Neighbors (KNN) classifier trained on preprocessed character samples, paired with a graphical interface so users can draw or upload characters and see predictions in real time. The project explores classical ML for script-specific recognition without deep learning.

Python, Machine Learning, KNN, OpenCV, scikit-learn
Year: 2018

Problem
  • Limited availability of labeled Sinhala character datasets
  • Complexity of Sinhala script with diacritical marks
  • Choosing effective features and k for similar-looking characters
  • Making the tool usable through a clear graphical interface
Solution
  • Collected and labeled a custom Sinhala character dataset
  • Applied preprocessing tuned for handwritten Sinhala glyphs
  • Tuned KNN hyperparameters (k, distance metric) on validation data
  • Wrapped inference in a simple GUI for interactive testing
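
The hyperparameter tuning step could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the data is synthetic (random clusters standing in for character feature vectors), the candidate values for k and the metric are arbitrary, and it uses 5-fold cross-validation through scikit-learn's GridSearchCV in place of a single held-out validation set.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the character dataset (assumption): 5 classes,
# 40 samples each, 64 features per sample, clustered around random centers.
rng = np.random.default_rng(0)
n_classes, per_class, n_features = 5, 40, 64
centers = rng.normal(size=(n_classes, n_features))
X = np.vstack([c + 0.5 * rng.normal(size=(per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

# Hold out a test split that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate values are illustrative, not the ones used in the project.
param_grid = {
    "n_neighbors": [1, 3, 5, 7],
    "metric": ["euclidean", "manhattan"],
}

# Cross-validated grid search over k and the distance metric.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```

Odd values of k are a common choice because they reduce tied votes, although ties can still occur when there are more than two classes.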

Key Features

K-Nearest Neighbors classifier for Sinhala character recognition

Handwritten character input via drawing canvas or image upload

Image preprocessing and feature extraction before classification

User-friendly graphical interface for live predictions

Configurable k parameter and model evaluation workflow

Support for Sinhala script-specific character classes

Technologies

Pythonscikit-learnOpenCVNumPyTkinter

Learnings

  • Implemented KNN classification for image-based character recognition

  • Built image preprocessing pipelines for handwritten input

  • Learned how the choice of distance metric and k value affects classifier performance

  • Designed an accessible GUI for non-technical users to test the model
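
To make the roles of k and the distance metric concrete, here is a small from-scratch KNN sketch in NumPy. It is not the project's code, which uses scikit-learn; the function name `knn_predict`, the toy 2-D points, and the labels "ka" and "ga" are all made up for illustration.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x, k=3, metric="euclidean"):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    diff = X_train - x
    if metric == "euclidean":
        dists = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "manhattan":
        dists = np.abs(diff).sum(axis=1)
    else:
        raise ValueError(f"unsupported metric: {metric}")

    # Indices of the k training points closest to x.
    nearest = np.argsort(dists, kind="stable")[:k]

    # Majority vote over the neighbors' labels.
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters with hypothetical labels.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array(["ka", "ka", "ka", "ga", "ga", "ga"])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5]), k=3))
print(knn_predict(X_train, y_train, np.array([5.5, 5.5]), k=3, metric="manhattan"))
```

A larger k smooths the decision over more neighbors, which can help with noisy samples but may blur the boundary between similar-looking characters.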

Highlights

KNN Classifier

Handwritten Recognition

GUI Application
