DSpace Repository / Université Larbi Tébessi-Tébessa

A hybrid approach for offline handwritten and printed script identification

dc.contributor.author DJELAL, Rayane
dc.date.accessioned 2025-11-13T11:26:42Z
dc.date.available 2025-11-13T11:26:42Z
dc.date.issued 2025-06-10
dc.identifier.uri http://localhost:8080/jspui/handle/123456789/13494
dc.description.abstract With the growing need to understand and analyze multilingual documents, recognizing different scripts has become a major challenge in optical character recognition (OCR) and automated document analysis. This work presents a hybrid approach that combines the power of deep learning for visual feature extraction with the flexibility of traditional machine learning algorithms for classification. The proposed system is based on a deep neural network called YafNet, designed to extract script-specific features from images of printed and handwritten words. These features are then passed to several classical classifiers, most notably logistic regression, to identify the script type. The methodology comprises three steps: image preprocessing, deep feature extraction with YafNet, and classification using eight different models (including SVM, RF, KNN, and XGBoost). Experiments were conducted on the MDIW-13 dataset, which covers 13 different writing systems in both printed and handwritten form. The YafNet–LogReg model performed well in all scenarios, successfully distinguishing between the various script types. Error analysis showed that most confusions occurred between visually or linguistically similar scripts, which highlights both the difficulty of the task and the strength of the proposed method. This study shows the benefit of combining deep representations from CNNs with simple and efficient classifiers. The proposed approach can be applied in real-world use cases such as automatic language or script identification in document archiving systems, or as a first step in multilingual OCR systems. en_US
dc.language.iso en en_US
dc.publisher University of Echahid Cheikh Larbi Tébessi - Tébessa en_US
dc.subject Script identification, Deep learning, Machine learning, Hybrid approach, Multilingual text recognition, Handwritten and printed text classification. en_US
dc.title A hybrid approach for offline handwritten and printed script identification en_US
dc.type Thesis en_US
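
The two-stage design described in the abstract (a feature extractor followed by a simple classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: `extract_features` is a hand-written placeholder standing in for YafNet, the images are tiny synthetic stroke patterns rather than MDIW-13 word images, and the classifier is a small from-scratch multinomial logistic regression with hypothetical names and hyperparameters.

```python
import math
import random


def extract_features(image):
    """Placeholder for the deep feature extractor (YafNet in the abstract).

    Hypothetical stand-in: counts horizontally and vertically adjacent ink
    pixels. A real system would return CNN activations instead.
    """
    rows, cols = len(image), len(image[0])
    h = sum(r[j] and r[j + 1] for r in image for j in range(cols - 1))
    v = sum(image[i][j] and image[i + 1][j]
            for i in range(rows - 1) for j in range(cols))
    return [h / cols, v / rows]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, x):
    xb = x + [1.0]  # append a constant 1.0 so the last weight acts as a bias
    return [sum(w * v for w, v in zip(row, xb)) for row in W]


def train_logreg(X, y, n_classes, lr=0.5, epochs=100):
    """Multinomial logistic regression trained with plain SGD."""
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, x))
            xb = x + [1.0]
            for k in range(n_classes):
                grad = probs[k] - (1.0 if k == label else 0.0)
                for j, value in enumerate(xb):
                    W[k][j] -= lr * grad * value
    return W


def predict(W, x):
    scores = class_scores(W, x)
    return scores.index(max(scores))


def make_image(label, rng, size=8):
    """Synthetic 'script': class 0 is a horizontal stroke, class 1 a vertical one."""
    img = [[0] * size for _ in range(size)]
    pos = rng.randrange(size)
    for t in range(size):
        if label == 0:
            img[pos][t] = 1
        else:
            img[t][pos] = 1
    return img


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
images = [make_image(label, rng) for label in labels]
features = [extract_features(img) for img in images]  # stage 1: features

W = train_logreg(features[:30], labels[:30], n_classes=2)  # stage 2: classifier
predictions = [predict(W, f) for f in features[30:]]
accuracy = sum(p == t for p, t in zip(predictions, labels[30:])) / len(predictions)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The point of the sketch is the separation of concerns: the classifier only ever sees feature vectors, so the extractor can be swapped (for example, for a trained CNN) without changing the classification stage, and several classifiers can be compared on the same extracted features.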

