Self-supervised learning for videos: A survey
The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …
large-scale annotated datasets. However, obtaining annotations is expensive and requires …
Scene text detection and recognition: The deep learning era
With the rise and development of deep learning, computer vision has been tremendously
transformed and reshaped. As an important research area in computer vision, scene text …
transformed and reshaped. As an important research area in computer vision, scene text …
Redet: A rotation-equivariant detector for aerial object detection
Recently, object detection in aerial images has gained much attention in computer vision.
Different from objects in natural images, aerial objects are often distributed with arbitrary …
Different from objects in natural images, aerial objects are often distributed with arbitrary …
Trocr: Transformer-based optical character recognition with pre-trained models
Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …
approaches are usually built based on CNN for image understanding and RNN for char …
Ocr-free document understanding transformer
Understanding document images (eg, invoices) is a core but challenging task since it
requires complex functions such as reading text and a holistic understanding of the …
requires complex functions such as reading text and a holistic understanding of the …
Scene text recognition with permuted autoregressive sequence models
D Bautista, R Atienza - European conference on computer vision, 2022 - Springer
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition
Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively
model linguistic rules in end-to-end deep networks remains a research challenge. In this …
model linguistic rules in end-to-end deep networks remains a research challenge. In this …
Learning RoI transformer for oriented object detection in aerial images
Object detection in aerial images is an active yet challenging task in computer vision
because of the bird's-eye view perspective, the highly complex backgrounds, and the variant …
because of the bird's-eye view perspective, the highly complex backgrounds, and the variant …
What is wrong with scene text recognition model comparisons? dataset and model analysis
Many new proposals for scene text recognition (STR) models have been introduced in
recent years. While each claim to have pushed the boundary of the technology, a holistic …
recent years. While each claim to have pushed the boundary of the technology, a holistic …
Towards accurate scene text recognition with semantic reasoning networks
Scene text image contains two levels of contents: visual texture and semantic information.
Although the previous scene text recognition methods have made great progress over the …
Although the previous scene text recognition methods have made great progress over the …