Text line segmentation in historical document images using an adaptive u-net architecture

O Mechi, M Mehri, R Ingold… - … Conference on Document …, 2019 - ieeexplore.ieee.org
On most document image transcription, indexing and retrieval systems, text line
segmentation remains one of the most important preliminary tasks. Hence, the research …

Robust text line detection in historical documents: learning and evaluation methods

M Boillet, C Kermorvant, T Paquet - International Journal on Document …, 2022 - Springer
Text line segmentation is one of the key steps in historical document understanding. It is
challenging due to the variety of fonts, contents, writing styles and the quality of documents …

Multiple document datasets pre-training improves text line detection with deep neural networks

M Boillet, C Kermorvant… - 2020 25th International …, 2021 - ieeexplore.ieee.org
In this paper, we introduce a fully convolutional network for the document layout analysis
task. While state-of-the-art methods are using models pre-trained on natural scene images …

A two-step framework for text line segmentation in historical Arabic and Latin document images

O Mechi, M Mehri, R Ingold… - International Journal on …, 2021 - Springer
One of the most important preliminary tasks in a transcription system of historical document
images is text line segmentation. Nevertheless, this task remains complex due to the …

Text line extraction in historical documents using Mask R-CNN

A Droby, B Kurar Barakat, R Alaasam, B Madi… - Signals, 2022 - mdpi.com
Text line extraction is an essential preprocessing step in many handwritten document image
analysis tasks. It includes detecting text lines in a document image and segmenting the …

Indiscapes: Instance segmentation networks for layout parsing of historical indic manuscripts

A Prusty, S Aitha, A Trivedi… - … on Document Analysis …, 2019 - ieeexplore.ieee.org
Historical palm-leaf manuscripts and early paper documents from the Indian subcontinent form
an important part of the world's literary and cultural heritage. Despite their importance, large …

Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model

R Elanwar, W Qin, M Betke, D Wijaya - International Journal on Document …, 2021 - Springer
Datasets of documents in Arabic are urgently needed to promote computer vision and
natural language processing research that addresses the specifics of the language …

CNN-based Methods for Offline Arabic Handwriting Recognition: A Review

M El Khayati, I Kich, Y Taouil - Neural Processing Letters, 2024 - Springer
Arabic Handwriting Recognition (AHR) is a complex task involving the
transformation of handwritten Arabic text from image format into machine-readable data …

BADAM: a public dataset for baseline detection in Arabic-script manuscripts

B Kiessling, DSB Ezra, MT Miller - … of the 5th International Workshop on …, 2019 - dl.acm.org
The application of handwritten text recognition to historical works is highly dependent on
accurate text line retrieval. A number of systems utilizing a robust baseline detection …

Learning-free, divide and conquer text-line extraction algorithm for printed Arabic text with diacritics

A Qaroush, A Awad, A Hanani, K Mohammad… - Journal of King Saud …, 2022 - Elsevier
The extraction of text lines from document images is a critical step in optical character
recognition. It is still considered an open document analysis problem. The presence of …