Pearson correlation-based feature selection for document classification using balanced training

IM Nasir, MA Khan, M Yasmin, JH Shah, M Gabryel… - Sensors, 2020 - mdpi.com
Documents are stored in a digital form across several organizations. Printing this amount of
data and placing it into folders instead of storing digitally is against the practical, economical …

Information extraction from text intensive and visually rich banking documents

B Oral, E Emekligil, S Arslan, G Eryiğit - Information Processing & …, 2020 - Elsevier
Document types, where visual and textual information plays an important role in their
analysis and understanding, pose a new and attractive area for information extraction …

A multi-modal approach to digital document stream segmentation for title insurance domain

A Guha, A Alahmadi, D Samanta, MZ Khan… - IEEE …, 2022 - ieeexplore.ieee.org
In the twenty-first century, storing and managing digital documents has become
commonplace for all corporate and public sectors around the world. Physical documents are …

Beyond Document Page Classification: Design, Datasets, and Challenges

J Van Landeghem, S Biswas… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper highlights the need to bring document classification benchmarking closer to
real-world applications, both in the nature of data tested (X: multi-channel, multi-paged, multi …

Sequence-aware multimodal page classification of Brazilian legal documents

PH Luz de Araujo, APGS de Almeida… - International Journal on …, 2023 - Springer
The Brazilian Supreme Court receives tens of thousands of cases each semester.
Court employees spend thousands of hours to execute the initial analysis and classification …

Leveraging effectiveness and efficiency in page stream deep segmentation

FA Braz, NC da Silva, JAS Lima - Engineering Applications of Artificial …, 2021 - Elsevier
The separation of documents contained in a page stream is a critical activity in some
segments. That is the case of the Brazilian judiciary system since it is overwhelmed with files …

A hybrid web analytic approach through click enabled vision based page segmentation in quest software for school students

R Muruganandham, A Sheik Abdullah… - Journal of Intelligent …, 2022 - content.iospress.com
The primary goal of this study is to optimize web content for a positive user experience and
to develop a data-driven methodology to assess the success of visitor flow on a website for …

Using Deep-Learned Vector Representations for Page Stream Segmentation by Agglomerative Clustering

L Busch, R van Heusden, M Marx - Algorithms, 2023 - mdpi.com
Page stream segmentation (PSS) is the task of retrieving the boundaries that separate
source documents given a consecutive stream of documents (for example, sequentially …

Tab this folder of documents: page stream segmentation of business documents

T Mungmeeprued, Y Ma, N Mehta… - Proceedings of the 22nd …, 2022 - dl.acm.org
In the midst of digital transformation, automatically understanding the structure and
composition of scanned documents is important in order to allow correct indexing, archiving …

BCubed revisited: elements like me

R van Heusden, J Kamps, M Marx - Proceedings of the 2022 ACM SIGIR …, 2022 - dl.acm.org
BCubed is a mathematically clean, elegant and intuitively well behaved external
performance metric for clustering tasks. BCubed compares a predicted clustering to a known …