Document parsing unveiled: Techniques, challenges, and prospects for structured information extraction
Document parsing is essential for converting unstructured and semi-structured documents,
such as contracts, academic papers, and invoices, into structured, machine-readable data …
From Detection to Application: Recent Advances in Understanding Scientific Tables and Figures
Tables and figures are usually used to present information in a structured and visual way in
scientific documents. Understanding the tables and figures in scientific documents is …
Lineformer: Line chart data extraction using instance segmentation
Data extraction from line-chart images is an essential component of the automated
document understanding process, as line charts are a ubiquitous data visualization format …
Swin-chart: An efficient approach for chart classification
A Dhote, M Javed, DS Doermann - Pattern Recognition Letters, 2024 - Elsevier
Charts are a visualization tool used in scientific documents to facilitate easy comprehension
of complex relationships underlying data and experiments. Researchers use various chart …
SciOL and MuLMS-Img: Introducing A Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain
In scientific publications, a substantial part of the information is expressed via figures
containing images and diagrams. Hence, the retrieval of relevant figures given a natural …
Hierarchical Recognizing Vector Graphics and A New Chart-based Vector Graphics Dataset
The conventional approach to image recognition has been based on raster graphics, which
can suffer from aliasing and information loss when scaled up or down. In this paper, we …
C3E: A framework for chart classification and content extraction
Incorporating charts into technical documents enhances richness by simplifying complex
data representation and improving comprehension. However, automated chart content …
A survey and approach to chart classification
A Dhote, M Javed, DS Doermann - International Conference on Document …, 2023 - Springer
Charts represent an essential source of visual information in documents and facilitate a
deep understanding and interpretation of information typically conveyed numerically. In the …
Text Role Classification in Scientific Charts Using Multimodal Transformers
Text role classification involves classifying the semantic role of textual elements within
scientific charts. We propose to finetune the multimodal document layout analysis models …
SpaDen: Sparse and Dense Keypoint Estimation for Real-World Chart Understanding
We introduce a novel bottom-up approach for the extraction of chart data. Our model utilizes
images of charts as inputs and learns to detect keypoints (KP), which are used to reconstruct …