SwinDocSegmenter: An end-to-end unified domain adaptive transformer for document instance segmentation

A Banerjee, S Biswas, J Lladós, U Pal - International Conference on …, 2023 - Springer
Instance-level segmentation of documents consists in assigning a class-aware and instance-
aware label to each pixel of the image. It is a key step in document parsing for their …

Computer vision and machine learning approaches for metadata enrichment to improve searchability of historical newspaper collections

D Ali, K Milleville, S Verstockt… - Journal of …, 2024 - emerald.com
Purpose: Historical newspaper collections provide a wealth of information about the past.
Although the digitization of these collections significantly improves their accessibility, a large …

Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model

R Elanwar, W Qin, M Betke, D Wijaya - International Journal on Document …, 2021 - Springer
Datasets of documents in Arabic are urgently needed to promote computer vision and
natural language processing research that addresses the specifics of the language …

Text and graphics segmentation of newspapers printed in Gurmukhi script: a hybrid approach

RP Kaur, MK Jindal, M Kumar - The Visual Computer, 2021 - Springer
Newspapers are always a standard medium to convey important information to masses of
people in recent time as well as in old time. An automated system is required to convert …

SemiDocSeg: harnessing semi-supervised learning for document layout analysis

A Banerjee, S Biswas, J Lladós, U Pal - International Journal on Document …, 2024 - Springer
Document Layout Analysis (DLA) is the process of automatically identifying and
categorizing the structural components (e.g., Text, Figure, Table, etc.) within a document to …

Comparative study of movie shot classification based on semantic segmentation

HY Bak, SB Park - Applied Sciences, 2020 - mdpi.com
The shot-type decision is a very important pre-task in movie analysis due to the vast
information, such as the emotion, psychology of the characters, and space information, from …

Accurate fine-grained layout analysis for the historical Tibetan document based on the instance segmentation

P Zhao, W Wang, Z Cai, G Zhang, Y Lu - IEEE Access, 2021 - ieeexplore.ieee.org
Accurate layout analysis without subsequent text-line segmentation remains an ongoing
challenge, especially when facing the Kangyur, a kind of historical Tibetan document …

Newspaper elements detection and newspaper pages categorization using CNNs and transformers

A Almutairi - International Journal on Document Analysis and …, 2024 - Springer
Newspaper digitization has gained wide interest around the world. Archives of digitized
newspapers and magazines contain a wealth of information that spans decades. To extract …

DCT-CompSegNet: fast layout segmentation in DCT compressed JPEG document images using deep feature learning

B Rajesh, SM Zaman, M Javed, M Lin - Multimedia Tools and Applications, 2024 - Springer
The problem of layout segmentation is still very challenging in document images like
newspapers, magazines, and research articles, that have both text and non-text components …

Toward a big data analysis system for historical newspaper collections research

SP Satheesan, Bhavya, A Davies, AB Craig… - Proceedings of the …, 2022 - dl.acm.org
The availability and generation of digitized newspaper collections have provided
researchers in several domains with a powerful tool to advance their research. More …