Jingqun Tang
ByteDance Inc.
Verified email at bytedance.com
Title
Cited by
Year
Few could be better than all: Feature sampling and grouping for scene text detection
J Tang, W Zhang, H Liu, MK Yang, B Jiang, G Hu, X Bai
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 73, 2022
SPTS v2: single-point scene text spotting
Y Liu, J Zhang, D Peng, M Huang, X Wang, J Tang, C Huang, D Lin, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Cited by 27, 2023
DocPedia: Unleashing the power of large multimodal model in the frequency domain for versatile document understanding
H Feng, Q Liu, H Liu, W Zhou, H Li, C Huang
arXiv preprint arXiv:2311.11810, 2023
Cited by 22, 2023
UniDoc: A universal large multimodal model for simultaneous text detection, recognition, spotting and understanding
H Feng, Z Wang, J Tang, J Lu, W Zhou, H Li, C Huang
arXiv preprint arXiv:2308.11592, 2023
Cited by 19, 2023
You can even annotate text with voice: Transcription-only-supervised text spotting
J Tang, S Qiao, B Cui, Y Ma, S Zhang, D Kanoulas
Proceedings of the 30th ACM International Conference on Multimedia, 4154-4163, 2022
Cited by 11, 2022
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
J Tang, C Lin, Z Zhao, S Wei, B Wu, Q Liu, H Feng, Y Li, S Wang, L Liao, ...
arXiv preprint arXiv:2404.12803, 2024
Cited by 5, 2024
Optimal boxes: boosting end-to-end scene text recognition by adjusting annotated bounding boxes via reinforcement learning
J Tang, W Qian, L Song, X Dong, L Li, X Bai
European Conference on Computer Vision, 233-248, 2022
Cited by 3, 2022
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
J Tang, Q Liu, Y Ye, J Lu, S Wei, C Lin, W Li, MFFB Mahmood, H Feng, ...
arXiv preprint arXiv:2405.11985, 2024
Cited by 2, 2024
Harmonizing Visual Text Comprehension and Generation
Z Zhao, J Tang, B Wu, C Lin, S Wei, H Liu, X Tan, Z Zhang, C Huang, ...
arXiv preprint arXiv:2407.16364, 2024
2024
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
J Lu, H Yu, Y Wang, Y Ye, J Tang, Z Yang, B Wu, Q Liu, H Feng, H Wang, ...
arXiv preprint arXiv:2407.01976, 2024
2024
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
W Zhao, H Feng, Q Liu, J Tang, S Wei, B Wu, L Liao, Y Ye, H Liu, H Li, ...
arXiv preprint arXiv:2406.01326, 2024
2024
Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
Z Zhao, C Huang, B Wu, C Lin, H Liu, Z Zhang, X Tan, J Tang, Y Xie
(CVPR 2024) arXiv preprint arXiv:2311.13120, 2023
2023
Character recognition competition for street view shop signs
J Tang, W Du, B Wang, W Zhou, S Mei, T Xue, X Xu, H Zhang
National Science Review 10 (6), nwad141, 2023
2023