Trocr: Transformer-based optical character recognition with pre-trained models
Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …
approaches are usually built based on CNN for image understanding and RNN for char …
Text detection, recognition, and script identification in natural scene images: A Review
V Naosekpam, N Sahu - International Journal of Multimedia Information …, 2022 - Springer
Text in natural scene images plays a vital role in scene understanding. It contains a rich and
abundant amount of valuable semantic information useful in many applications such as …
abundant amount of valuable semantic information useful in many applications such as …
Scene text recognition with permuted autoregressive sequence models
D Bautista, R Atienza - European conference on computer vision, 2022 - Springer
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
Svtr: Scene text recognition with a single visual model
Dominant scene text recognition models commonly contain two building blocks, a visual
model for feature extraction and a sequence model for text transcription. This hybrid …
model for feature extraction and a sequence model for text transcription. This hybrid …
Multi-granularity prediction for scene text recognition
Scene text recognition (STR) has been an active research topic in computer vision for years.
To tackle this challenging problem, numerous innovative methods have been successively …
To tackle this challenging problem, numerous innovative methods have been successively …
Dan: a segmentation-free document attention network for handwritten document recognition
D Coquenet, C Chatelain… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
Unconstrained handwritten text recognition is a challenging computer vision task. It is
traditionally handled by a two-step approach, combining line segmentation followed by text …
traditionally handled by a two-step approach, combining line segmentation followed by text …
Multi-modal text recognition networks: Interactive enhancements between visual and semantic features
Linguistic knowledge has brought great benefits to scene text recognition by providing
semantics to refine character sequences. However, since linguistic knowledge has been …
semantics to refine character sequences. However, since linguistic knowledge has been …
LISTER: Neighbor decoding for length-insensitive scene text recognition
The diversity in length constitutes a significant characteristic of text. Due to the long-tail
distribution of text lengths, most existing methods for scene text recognition (STR) only work …
distribution of text lengths, most existing methods for scene text recognition (STR) only work …
Dtrocr: Decoder-only transformer for optical character recognition
M Fujitake - Proceedings of the IEEE/CVF Winter …, 2024 - openaccess.thecvf.com
Typical text recognition methods rely on an encoder-decoder structure, in which the encoder
extracts features from an image, and the decoder produces recognized text from these …
extracts features from an image, and the decoder produces recognized text from these …