PP-OCRv3: More attempts for the improvement of ultra lightweight OCR system

Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui… - arXiv preprint arXiv …, 2024 - arxiv.org

In this report, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …

Cited by 172 Related articles All 2 versions

Towards robust tampered text detection in document image: New dataset and new solution

C Qu, C Liu, Y Liu, X Chen, D Peng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, tampered text detection in document images has attracted increasing attention
due to its essential role in information security. However, detecting visually consistent …

Cited by 20 Related articles All 5 versions

Modeling entities as semantic points for visual information extraction in the wild

Z Yang, R Long, P Wang, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, Visual Information Extraction (VIE) has become increasingly
important in both academia and industry, due to the wide range of real-world applications …

Cited by 7 Related articles All 5 versions

UAV localization in low-altitude GNSS-denied environments based on POI and store signage text matching in UAV images

Y Liu, J Bai, G Wang, X Wu, F Sun, Z Guo, H Geng - Drones, 2023 - mdpi.com

Localization is the most important basic information for unmanned aerial vehicles (UAVs)
during their missions. Currently, most UAVs use GNSS to calculate their own position …

Cited by 9 Related articles All 4 versions

Bridging the Gap Between End-to-End and Two-Step Text Spotting

M Huang, H Li, Y Liu, X Bai… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Modularity plays a crucial role in the development and maintenance of complex systems.
While end-to-end text spotting efficiently mitigates the issues of error accumulation and sub …

Context perception parallel decoder for scene text recognition

Y Du, Z Chen, C Jia, X Yin, C Li, Y Du… - arXiv preprint arXiv …, 2023 - arxiv.org

Scene text recognition (STR) methods have struggled to attain high accuracy and fast
inference speed. Autoregressive (AR)-based STR model uses the previously recognized …

Cited by 6 Related articles All 2 versions

Toward real text manipulation detection: New dataset and new solution

D Luo, Y Liu, R Yang, X Liu, J Zeng, Y Zhou, X Bai - Pattern Recognition, 2025 - Elsevier

With the surge in realistic text tampering, detecting fraudulent text in images has gained
prominence for maintaining information security. However, the high costs associated with …

Cited by 1 Related articles All 3 versions

ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

C Duan, P Fu, S Guo, Q Jiang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

In recent years, text-image joint pre-training techniques have shown promising results in
various tasks. However, in Optical Character Recognition (OCR) tasks, aligning text instances …

Cited by 1 Related articles All 3 versions

A Comprehensive Framework for Industrial Sticker Information Recognition Using Advanced OCR and Object Detection Techniques

G Monteiro, L Camelo, G Aquino, RA Fernandes… - Applied Sciences, 2023 - mdpi.com

Recent advancements in Artificial Intelligence (AI), deep learning (DL), and computer vision
have revolutionized various industrial processes through image classification and object …

Cited by 9 Related articles All 5 versions

ESGNet: A multimodal network model incorporating entity semantic graphs for information extraction from Chinese resumes

S Luo, J Yu - Information Processing & Management, 2024 - Elsevier

Corporations require screening critical information from numerous resumes with different
formats and content for managerial decision-making. However, traditional manual screening …

Cited by 2 Related articles All 2 versions