StrucTexT: Structured text understanding with multi-modal transformers Y Li, Y Qian, Y Yu, X Qin, C Zhang, Y Liu, K Yao, J Han, J Liu, E Ding Proceedings of the 29th ACM International Conference on Multimedia, 1912-1920, 2021 | 115 | 2021 |
Group DETR: Fast DETR training with group-wise one-to-many assignment Q Chen, X Chen, J Wang, S Zhang, K Yao, H Feng, J Han, E Ding, G Zeng, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 85 | 2023 |
ViSTA: Vision and scene text aggregation for cross-modal retrieval M Cheng, Y Sun, L Wang, X Zhu, K Yao, J Chen, G Song, J Han, J Liu, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 68 | 2022 |
MaskOCR: Text recognition with masked encoder-decoder pretraining P Lyu, C Zhang, S Liu, M Qiao, Y Xu, L Wu, K Yao, J Han, E Ding, J Wang arXiv preprint arXiv:2206.00311, 2022 | 38 | 2022 |
StrucTexTv2: Masked visual-textual prediction for document image pre-training Y Yu, Y Li, C Zhang, X Zhang, Z Guo, X Qin, K Yao, J Han, E Ding, J Wang arXiv preprint arXiv:2303.00289, 2023 | 37 | 2023 |
CAE v2: Context autoencoder with CLIP target X Zhang, J Chen, J Yuan, Q Chen, J Wang, X Wang, S Han, X Chen, J Pi, ... arXiv preprint arXiv:2211.09799, 2022 | 23 | 2022 |
Decoupling recognition from detection: Single shot self-reliant scene text spotter J Wu, P Lyu, G Lu, C Zhang, K Yao, W Pei Proceedings of the 30th ACM International Conference on Multimedia, 1319-1328, 2022 | 21 | 2022 |
Group pose: A simple baseline for end-to-end multi-person pose estimation H Liu, Q Chen, Z Tan, JJ Liu, J Wang, X Su, X Li, K Yao, J Han, E Ding, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 18 | 2023 |
Learning structure-guided diffusion model for 2D human pose estimation Z Qiu, Q Yang, J Wang, X Wang, C Xu, D Fu, K Yao, J Han, E Ding, ... arXiv preprint arXiv:2306.17074, 2023 | 13 | 2023 |
FROSTER: Frozen CLIP is a strong teacher for open-vocabulary action recognition X Huang, H Zhou, K Yao, K Han arXiv preprint arXiv:2402.03241, 2024 | 10 | 2024 |
Towards robust real-time scene text detection: From semantic to instance representation learning X Qin, P Lyu, C Zhang, Y Zhou, K Yao, P Zhang, H Lin, W Wang Proceedings of the 31st ACM International Conference on Multimedia, 2025-2034, 2023 | 7 | 2023 |
TRUST: An accurate and end-to-end table structure recognizer using splitting-based transformers Z Guo, Y Yu, P Lv, C Zhang, H Li, Z Wang, K Yao, J Liu, J Wang arXiv preprint arXiv:2208.14687, 2022 | 7 | 2022 |
HAP: Structure-aware masked image modeling for human-centric perception J Yuan, X Zhang, H Zhou, J Wang, Z Qiu, Z Shao, S Zhang, S Long, ... Advances in Neural Information Processing Systems 36, 2024 | 6 | 2024 |
GridFormer: Towards accurate table structure recognition via grid prediction P Lyu, W Ma, H Wang, Y Yu, C Zhang, K Yao, Y Xue, J Wang Proceedings of the 31st ACM International Conference on Multimedia, 7747-7757, 2023 | 5 | 2023 |
Fast-StrucTexT: An efficient hourglass transformer with modality-guided dynamic token merge for document understanding M Zhai, Y Li, X Qin, C Yi, Q Xie, C Zhang, K Yao, Y Wu, Y Jia arXiv preprint arXiv:2305.11392, 2023 | 5 | 2023 |
ICDAR 2023 competition on structured text extraction from visually-rich document images W Yu, C Zhang, H Cao, W Hua, B Li, H Chen, M Liu, M Chen, J Kuang, ... International Conference on Document Analysis and Recognition, 536-552, 2023 | 4 | 2023 |
CAE v2: Context autoencoder with CLIP latent alignment X Zhang, J Chen, J Yuan, Q Chen, J Wang, X Wang, S Han, X Chen, J Pi, ... Transactions on Machine Learning Research, 2023 | 4 | 2023 |
MataDoc: Margin and text aware document dewarping for arbitrary boundary B Dai, Q Xie, Y Li, X Qin, C Zhang, K Yao, J Han arXiv preprint arXiv:2307.12571, 2023 | 2 | 2023 |
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer Y Wang, X Su, Q Chen, X Zhang, T Xi, K Yao, E Ding, G Zhang, J Wang arXiv preprint arXiv:2407.10655, 2024 | 1 | 2024 |
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection Q Chen, X Su, X Zhang, J Wang, J Chen, Y Shen, C Han, Z Chen, W Xu, ... arXiv preprint arXiv:2406.03459, 2024 | 1 | 2024 |