Digitizing History: Transitioning Historical Paper Documents to Digital Content for Information Retrieval and Mining—A Comprehensive Survey

N Girdhar, M Coustaty, A Doucet - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Historical document processing (HDP) corresponds to the task of converting the physical-
bind form of historical archives into a web-based centrally digitized form for their …

Leveraging Collection-Wide Similarities for Unsupervised Document Structure Extraction

G Lior, Y Goldberg, G Stanovsky - arXiv preprint arXiv:2402.13906, 2024 - arxiv.org
Document collections of various domains, e.g., legal, medical, or financial, often share some
underlying collection-wide structure, which captures information that can aid both human …

(Mis)Matching Metadata: Improving Accessibility in Digital Visual Archives through the EyCon Project

K Aske, M Giardinetti - ACM Journal on Computing and Cultural …, 2023 - dl.acm.org
Discussing the current AHRC/LABEX-funded EyCon (Early Conflict Photography 1890–
1918 and Visual AI) project, this article considers potentially problematic metadata and how …

Line-level layout recognition of historical documents with background knowledge

N Fischer, A Hartelt, F Puppe - Algorithms, 2023 - mdpi.com
Digitization and transcription of historic documents offer new research opportunities for
humanists and are the topics of many edition projects. However, manual work is still …

Layout Analysis of Punjabi Newspapers Using Contour Detection and Deep Learning-Based Model

A Kumar, GS Lehal - Advances in Networks, Intelligence and …, 2024 - taylorfrancis.com
Layout analysis of the newspaper to segment the newspaper image into various text and
graphic regions. Various applications of layout analysis are used in OCR to identify text …

Faster CNN-Based Layout Analysis of Punjabi Newspapers Using the Custom Dataset

A Kumar, GS Lehal - International Conference on Human-Centric Smart …, 2023 - Springer
Layout analysis is an important step in the recognition of text from scanned newspapers. In
this paper, we have collected newspapers from the Punjabi Tribune. These newspaper …