End-to-end video text detection with online tracking

H Yu, Y Huang, L Pi, C Zhang, X Li, L Wang - Pattern Recognition, 2021 - Elsevier
Text in videos usually acts as important semantic cues, which is helpful to video analysis.
Video text detection is considered as one of the most difficult tasks in document analysis due …

Semantic-aware video text detection

W Feng, F Yin, XY Zhang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Most existing video text detection methods track texts with appearance features, which are
easily influenced by the change of perspective and illumination. Compared with appearance …

End-to-end video text spotting with transformer

W Wu, Y Cai, C Shen, D Zhang, Y Fu, H Zhou… - International Journal of …, 2024 - Springer
Recent video text spotting methods usually require the three-staged pipeline, i.e., detecting
text in individual images, recognizing localized text, tracking text streams with post …

FREE: A fast and robust end-to-end video text spotter

Z Cheng, J Lu, B Zou, L Qiao, Y Xu, S Pu… - … on Image Processing, 2020 - ieeexplore.ieee.org
Currently, video text spotting tasks usually fall into the four-staged pipeline: detecting text
regions in individual images, recognizing localized text regions frame-wisely, tracking text …

Video text tracking with a spatio-temporal complementary model

Y Gao, X Li, J Zhang, Y Zhou, D Jin… - … on Image Processing, 2021 - ieeexplore.ieee.org
Text tracking is to track multiple texts in a video, and construct a trajectory for each text.
Existing methods tackle this task by utilizing the tracking-by-detection framework, i.e. …

You only recognize once: Towards fast video text spotting

Z Cheng, J Lu, Y Niu, S Pu, F Wu, S Zhou - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Video text spotting is still an important research topic due to its various real-applications.
Previous approaches usually fall into the four-staged pipeline: text detection in individual …

Real-time end-to-end video text spotter with contrastive representation learning

W Wu, Z Li, J Li, C Shen, H Zhou, S Li, Z Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Video text spotting (VTS) is the task that requires simultaneously detecting, tracking and
recognizing text in the video. Existing video text spotting methods typically develop …

Scale-residual learning network for scene text detection

Y Cai, C Liu, P Cheng, D Du, L Zhang… - … on Circuits and …, 2020 - ieeexplore.ieee.org
Detecting incidentally captured text in the wild remains an open problem due to challenging
factors including unconstrained scenarios and large scale variation. In this paper, we …

A new deep wavefront based model for text localization in 3D video

L Nandanwar, P Shivakumara… - … on Circuits and …, 2021 - ieeexplore.ieee.org
With the evolution of electronic devices, such as 3D cameras, addressing the challenges of
text localization in 3D video (e.g., for indexing) is increasingly drawing the attention of the …

Towards accurate video text spotting with text-wise semantic reasoning

X Zu, H Yu, B Li, X Xue - Proceedings of the Thirty-Second International …, 2023 - dl.acm.org
Video text spotting (VTS) aims at extracting texts from videos, where text detection, tracking
and recognition are conducted simultaneously. There have been some works that can tackle …