GeoLayoutLM: Geometric pre-training for visual information extraction
Visual information extraction (VIE) plays an important role in Document Intelligence.
Generally, it is divided into two tasks: semantic entity recognition (SER) and relation …
Vision grid transformer for document layout analysis
Document pre-trained models and grid-based models have proven to be very effective on
various tasks in Document AI. However, for the document layout analysis (DLA) task …
Conditional text image generation with diffusion models
Y Zhu, Z Li, T Wang, M He… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current text recognition systems, including those for handwritten scripts and scene text, have
relied heavily on image synthesis and augmentation, since it is difficult to realize real-world …
CDistNet: Perceiving multi-domain character distance for robust text recognition
The transformer-based encoder-decoder framework is becoming popular in scene text
recognition, largely because it naturally integrates recognition clues from both visual and …
LISTER: Neighbor decoding for length-insensitive scene text recognition
The diversity in length constitutes a significant characteristic of text. Due to the long-tail
distribution of text lengths, most existing methods for scene text recognition (STR) only work …
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing
B Zhang, H Xie, Z Gao, Y Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Scene text images contain not only style information (font, background) but also content
information (character, texture). Different scene text tasks need different information but …
Symmetrical linguistic feature distillation with CLIP for scene text recognition
In this paper, we explore the potential of the Contrastive Language-Image Pretraining (CLIP)
model in scene text recognition (STR), and establish a novel Symmetrical Linguistic Feature …
A survey of text detection and recognition algorithms based on deep learning technology
XF Wang, ZH He, K Wang, YF Wang, L Zou, ZZ Wu - Neurocomputing, 2023 - Elsevier
Optical Character Recognition (OCR) poses a crucial challenge within the realm of
computer vision research, as it plays a pivotal role in converting vast amounts of …
Linguistic more: Taking a further step toward efficient and accurate scene text recognition
Vision models have gained increasing attention due to their simplicity and efficiency in Scene
Text Recognition (STR) task. However, due to lacking the perception of linguistic knowledge …
Class-Aware Mask-guided feature refinement for scene text recognition
Scene text recognition is a rapidly developing field that faces numerous challenges due to
the complexity and diversity of scene text, including complex backgrounds, diverse fonts …