DSpace Repository / Université Larbi Tébessi-Tébessa

Contribution to historical documents classification

dc.contributor.author Boudraa, Merouane
dc.date.accessioned 2025-10-05T12:31:08Z
dc.date.available 2025-10-05T12:31:08Z
dc.date.issued 2025-05-15
dc.identifier.uri http://localhost:8080/jspui/handle/123456789/13276
dc.description.abstract Historical documents play a crucial role in unraveling the mysteries of history. However, studying these documents is a complex, time-consuming, and resource-intensive task, requiring significant expertise. With the rise of digital humanities and the availability of digital versions of documents, automating tasks related to these documents has become a possibility. One of the key challenges in this domain is “the classification of historical documents”. Early studies relied on traditional machine learning techniques that required manual feature extraction, but these methods often fell short in terms of accuracy. The advent of deep learning has opened new avenues for improving classification results, with various deep learning architectures being explored in this field. Despite the potential of deep learning, further exploration and the development of more efficient techniques are needed. This realization led us to contribute to the literature with our study on historical document classification. Our research provides a comprehensive exploration of the topic from both historical and computational perspectives. We began by understanding the complexities of historical document classification and analyzing existing studies. This analysis allowed us to identify key challenges and difficulties, guiding the direction of our contribution. Our approach begins with precise preprocessing techniques, including denoising via the Non-Local Means algorithm and binarization using the Canny edge detector. After these initial steps, we move into feature detection by applying the Harris corner detector to identify keypoints within the manuscript. These keypoints are then clustered using the k-means algorithm, allowing us to extract meaningful patches of predefined dimensions. This process not only isolates significant visual features but also supports systematic data augmentation to enhance the depth and diversity of our dataset.
In the final stage, we use vision transformers, a powerful deep learning architecture, fine-tuned to our specific classification tasks. By integrating automatically extracted handcrafted features, the model becomes a robust and effective classification system for historical document analysis. As a further refinement, we apply a majority vote over the image patches of each document to increase system accuracy. These findings have been validated and published in leading journals and conference proceedings, providing a strong foundation for future research in historical document analysis. en_US
dc.language.iso en en_US
dc.publisher Université Echahid Cheikh Larbi-Tebessi -Tébessa en_US
dc.subject Historical Documents, Machine Learning, Deep Learning, Classification, Dating, Writer Identification, Script Classification, Convolutional Neural Networks, Vision Transformers, Feature Extraction, Computer Vision, Pattern Recognition en_US
dc.title Contribution to historical documents classification en_US
dc.type Thesis en_US
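
The patch-extraction and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes keypoints have already been detected (for example by a Harris corner detector) and that a classifier has already produced one label per patch. The function names `kmeans`, `patch_box`, and `majority_vote`, the patch size of 32, and the centroid initialisation are all hypothetical choices made for illustration.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on (x, y) tuples.

    Assumption: centroids start at k evenly spaced entries of `points`,
    which keeps the sketch deterministic (real code would use k-means++).
    """
    step = max(len(points) // k, 1)
    centroids = [points[min(i * step, len(points) - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each keypoint to its nearest centroid (squared distance).
            nearest = min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g
            else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def patch_box(center, size, width, height):
    """Return (left, top, right, bottom) of a size x size patch around `center`,
    shifted so it stays inside a width x height image (assumed >= size)."""
    half = size // 2
    left = min(max(int(round(center[0])) - half, 0), max(width - size, 0))
    top = min(max(int(round(center[1])) - half, 0), max(height - size, 0))
    return (left, top, left + size, top + size)


def majority_vote(patch_labels):
    """Document label = most frequent patch label; ties go to the label seen first."""
    return Counter(patch_labels).most_common(1)[0][0]


# Hypothetical keypoints forming two groups on a 200 x 200 page image.
keypoints = [(10, 10), (12, 11), (11, 9), (100, 100), (102, 98), (101, 101)]
centers = kmeans(keypoints, k=2)
boxes = [patch_box(c, size=32, width=200, height=200) for c in centers]

# Hypothetical per-patch predictions from a fine-tuned classifier.
document_label = majority_vote(["writer_A", "writer_B", "writer_A"])
```

Each cluster centre yields one fixed-size patch, so the number of clusters controls how many patches (and therefore how many votes) a document contributes.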

