Pearson correlation-based feature selection for document classification using balanced training
Documents are stored in a digital form across several organizations. Printing this amount of
data and placing it into folders instead of storing digitally is against the practical, economical …
Information extraction from text intensive and visually rich banking documents
Document types, where visual and textual information plays an important role in their
analysis and understanding, pose a new and attractive area for information extraction …
A multi-modal approach to digital document stream segmentation for title insurance domain
In the twenty-first century, storing and managing digital documents has become
commonplace for all corporate and public sectors around the world. Physical documents are …
Beyond Document Page Classification: Design, Datasets, and Challenges
J Van Landeghem, S Biswas… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper highlights the need to bring document classification benchmarking closer to real-
world applications, both in the nature of data tested (X: multi-channel, multi-paged, multi …
Sequence-aware multimodal page classification of Brazilian legal documents
PH Luz de Araujo, APGS de Almeida… - International Journal on …, 2023 - Springer
Abstract The Brazilian Supreme Court receives tens of thousands of cases each semester.
Court employees spend thousands of hours to execute the initial analysis and classification …
Leveraging effectiveness and efficiency in page stream deep segmentation
FA Braz, NC da Silva, JAS Lima - Engineering Applications of Artificial …, 2021 - Elsevier
The separation of documents contained in a page stream is a critical activity in some
segments. That is the case of the Brazilian judiciary system since it is overwhelmed with files …
A hybrid web analytic approach through click enabled vision based page segmentation in quest software for school students
R Muruganandham, A Sheik Abdullah… - Journal of Intelligent …, 2022 - content.iospress.com
The primary goal of this study is to optimize web content for a positive user experience and
to develop a data-driven methodology to assess the success of visitor flow on a website for …
Using Deep-Learned Vector Representations for Page Stream Segmentation by Agglomerative Clustering
L Busch, R van Heusden, M Marx - Algorithms, 2023 - mdpi.com
Page stream segmentation (PSS) is the task of retrieving the boundaries that separate
source documents given a consecutive stream of documents (for example, sequentially …
Tab this folder of documents: page stream segmentation of business documents
T Mungmeeprued, Y Ma, N Mehta… - Proceedings of the 22nd …, 2022 - dl.acm.org
In the midst of digital transformation, automatically understanding the structure and
composition of scanned documents is important in order to allow correct indexing, archiving …
BCubed revisited: elements like me
BCubed is a mathematically clean, elegant and intuitively well behaved external
performance metric for clustering tasks. BCubed compares a predicted clustering to a known …