Abstract:
With the growing need to understand and analyze multilingual documents, recognizing different scripts
has become a major challenge in optical character recognition (OCR) and automated document analysis.
This work presents a hybrid approach that combines the power of deep learning for visual feature
extraction with the flexibility of traditional machine learning algorithms for classification.
The proposed system is based on a deep neural network called YafNet, designed to extract script-specific
features from images of printed and handwritten words. These features are then fed to several classical
classifiers, most notably logistic regression, to identify the script type.
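As an illustration only (the abstract does not specify YafNet's architecture or training procedure), the minimal sketch below shows the general shape of such a hybrid pipeline: a small stand-in CNN plays the role of YafNet, its penultimate-layer activations serve as the deep feature vectors, and scikit-learn's LogisticRegression performs the final script classification. All class names, dimensions, and data here are hypothetical.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

class TinyScriptCNN(nn.Module):
    """Hypothetical stand-in for YafNet: a small CNN whose
    penultimate layer is used as a fixed-length feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.feat = nn.Linear(64, feat_dim)   # penultimate layer -> features

    def forward(self, x):
        x = self.conv(x).flatten(1)
        return self.feat(x)                   # deep features, not logits

@torch.no_grad()
def extract_features(model, images):
    """images: float tensor (N, 1, H, W) of grayscale word images."""
    model.eval()
    return model(images).cpu().numpy()

# --- usage sketch with random data in place of real word images ---
model = TinyScriptCNN()                  # would be a trained extractor
X_imgs = torch.rand(200, 1, 64, 128)     # 200 dummy word images
y = np.random.randint(0, 13, size=200)   # 13 script classes

X = extract_features(model, X_imgs)      # deep feature matrix (200, 128)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```

One appeal of this split is practical: once the extractor is trained, retraining or swapping the lightweight classifier on its fixed features is far cheaper than retraining the full network.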
The methodology includes several steps: image preprocessing, deep feature extraction with YafNet,
and classification using eight different models (including SVM, RF, KNN, and XGBoost). Experiments were
conducted on the MDIW-13 dataset, which includes 13 different writing systems, covering both printed
and handwritten texts. The YafNet–LogReg combination performed well in all scenarios, reliably
distinguishing among the 13 script classes. Error analysis showed that most confusions occurred between
visually or linguistically similar scripts, underscoring both the difficulty of the task and the
robustness of the proposed method.
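The eight-classifier comparison could be organized as in the sketch below, with synthetic features standing in for YafNet's output and default hyperparameters rather than the paper's settings; only five of the eight models named in the abstract (LogReg, SVM, RF, KNN, XGBoost) are shown.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier

# Synthetic stand-in for deep features over 13 script classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(260, 128))     # 260 word-level feature vectors
y = np.repeat(np.arange(13), 20)    # 20 samples per script class

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Default settings; the paper's hyperparameters are not given here.
classifiers = {
    "LogReg":  LogisticRegression(max_iter=1000),
    "SVM":     SVC(),
    "RF":      RandomForestClassifier(random_state=0),
    "KNN":     KNeighborsClassifier(),
    "XGBoost": XGBClassifier(),
}

for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: test accuracy = {clf.score(X_te, y_te):.3f}")
```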
This study shows the benefit of combining deep representations from CNNs with simple and efficient
classifiers. The proposed approach can be applied in real-world use cases such as automatic language or
script identification in document archiving systems, or as a first step in multilingual OCR systems.