dc.contributor.author |
Boudraa, Merouane |
|
dc.date.accessioned |
2025-10-05T12:31:08Z |
|
dc.date.available |
2025-10-05T12:31:08Z |
|
dc.date.issued |
2025-05-15 |
|
dc.identifier.uri |
http//localhost:8080/jspui/handle/123456789/13276 |
|
dc.description.abstract |
Historical documents play a crucial role in unraveling the mysteries of history.
However, studying these documents is a complex, time-consuming, and resource-intensive
task, requiring significant expertise. With the rise of digital humanities and the availability
of digitized documents, automating tasks related to them has become
possible. One of the key challenges in this domain is the classification of historical
documents. Early studies relied on traditional machine learning techniques that required
manual feature extraction, but these methods often fell short in terms of accuracy. The
advent of deep learning has opened new avenues for improving classification results, with
various deep learning architectures being explored in this field. Despite the potential of deep
learning, further exploration and the development of more efficient techniques are needed.
This realization led us to contribute to the literature with our study on historical document
classification. Our research provides a comprehensive exploration of the topic from both
historical and computational perspectives. We began by understanding the complexities of
historical document classification and analyzing existing studies. This analysis allowed us
to identify key challenges and difficulties, guiding the direction of our contribution.
Our approach begins with precise preprocessing techniques, including denoising via the
Non-Local Means algorithm and binarization using the Canny edge detector. After these
initial steps, we move into feature detection by applying the Harris corner detector to
identify keypoints within the manuscript. These keypoints are then clustered using the k-means
algorithm, allowing us to extract meaningful patches of predefined dimensions. This
process not only isolates significant visual features but also supports systematic data
augmentation to enhance the depth and diversity of our dataset. In the final stage, we
fine-tune vision transformers, a deep learning architecture, for our specific
classification tasks. By integrating automatically extracted
handcrafted features, the model becomes a robust and effective classification
system for historical document analysis. As a further refinement, we apply a
majority vote over image patches to improve system
accuracy. These findings have been validated and published in leading journals and
conference proceedings, providing a strong foundation for future research in historical
document analysis. |
en_US |
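The patch-level steps described in the abstract (clustering keypoints with k-means, cutting fixed-size patches around the cluster centers, and aggregating patch predictions by majority vote) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis implementation: the keypoints are assumed to come from a Harris corner detector run beforehand, the patch labels are assumed to come from a fine-tuned vision transformer that is not shown, and the names `kmeans`, `patch_box`, and `majority_vote` are hypothetical.

```python
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on 2-D points; centers start at the first k points."""
    centers = [(float(x), float(y)) for x, y in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centers[i][0]) ** 2 + (y - centers[i][1]) ** 2,
            )
            groups[nearest].append((x, y))
        # Recompute each center as the mean of its group; keep the old
        # center when a cluster ends up empty.
        centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def patch_box(center, size, width, height):
    """Return (left, top, right, bottom) of a size x size patch around center.

    The box is shifted to stay inside the image; the image is assumed to be
    at least size pixels wide and high.
    """
    half = size // 2
    left = min(max(int(round(center[0])) - half, 0), width - size)
    top = min(max(int(round(center[1])) - half, 0), height - size)
    return left, top, left + size, top + size


def majority_vote(patch_labels):
    """Document label = most frequent patch label (ties go to the first seen)."""
    return Counter(patch_labels).most_common(1)[0][0]


# Hypothetical keypoints (x, y), as a corner detector might return them.
keypoints = [(10, 10), (12, 14), (200, 180), (204, 184)]
centers = kmeans(keypoints, k=2)
boxes = [patch_box(c, size=16, width=256, height=256) for c in centers]

# Hypothetical per-patch predictions from a classifier.
document_label = majority_vote(["gothic", "caroline", "gothic"])
```

Each box could then be used to crop a patch from the page image before classification. Voting over patches means a single misclassified patch does not decide the label of the whole document.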
dc.language.iso |
en |
en_US |
dc.publisher |
Université Echahid Cheikh Larbi-Tebessi -Tébessa |
en_US |
dc.subject |
Historical Documents, Machine Learning, Deep Learning, Classification, Dating, Writer Identification, Script Classification, Convolutional Neural Networks, Vision Transformers, Feature Extraction, Computer Vision, Pattern Recognition |
en_US |
dc.title |
Contribution to historical documents classification |
en_US |
dc.type |
Thesis |
en_US |