A set of benchmarks for handwritten text recognition on historical documents

JA Sánchez, V Romero, AH Toselli, M Villegas… - Pattern Recognition, 2019 - Elsevier
Handwritten Text Recognition is an important requirement in order to make visible
the contents of the myriads of historical documents residing in public and private archives …

A survey of historical document image datasets

K Nikolaidou, M Seuret, H Mokayed… - International Journal on …, 2022 - Springer
This paper presents a systematic literature review of image datasets for document image
analysis, focusing on historical documents, such as handwritten manuscripts and early …

Deep learning for historical document analysis and recognition—a survey

F Lombardi, S Marinai - Journal of Imaging, 2020 - mdpi.com
Nowadays, deep learning methods are employed in a broad range of research fields. The
analysis and recognition of historical documents, as we survey in this work, is not an …

Automatic processing of Historical Arabic Documents: a comprehensive survey

MI Khedher, H Jmila, MA El-Yacoubi - Pattern Recognition, 2020 - Elsevier
Nowadays, there is a huge number of Historical Arabic Documents (HAD) in the national
libraries and archives around the world. Analyzing this type of data manually is a difficult and …

U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts

S Zottin, A De Nardin, E Colombi, C Piciarelli… - Neural Computing and …, 2024 - Springer
Document Layout Analysis, which is the task of identifying different semantic
regions inside of a document page, is a subject of great interest for both computer scientists …

Indiscapes: Instance segmentation networks for layout parsing of historical indic manuscripts

A Prusty, S Aitha, A Trivedi… - … on Document Analysis …, 2019 - ieeexplore.ieee.org
Historical palm-leaf manuscripts and early paper documents from the Indian subcontinent form
an important part of the world's literary and cultural heritage. Despite their importance, large …

BADAM: a public dataset for baseline detection in Arabic-script manuscripts

B Kiessling, DSB Ezra, MT Miller - … of the 5th International Workshop on …, 2019 - dl.acm.org
The application of handwritten text recognition to historical works is highly dependent on
accurate text line retrieval. A number of systems utilizing a robust baseline detection …

A modular region and text line layout analysis system

B Kiessling - 2020 17th International Conference on Frontiers in …, 2020 - ieeexplore.ieee.org
High quality document layout analysis is fundamental to the accurate processing of
handwritten textual material, on both the level of individual lines and higher order zones …

Word spotting using convolutional siamese network

BK Barakat, R Alasam, J El-Sana - 2018 13th IAPR …, 2018 - ieeexplore.ieee.org
We present a method for word spotting using a convolutional siamese network. A
convolutional siamese network employs two identical convolutional networks to rank …

The image and ground truth dataset of Mongolian movable-type newspapers for text recognition

M Lu, F Bao, H Zhang, G Gao - International Journal on Document …, 2024 - Springer
OCR approaches have been widely advanced in recent years thanks to the resurgence of
deep learning. However, to the best of our knowledge, there is little work on Mongolian …