Revisiting skeleton-based action recognition H Duan, Y Zhao, K Chen, D Lin, B Dai CVPR 2022, 2021 | 492 | 2021 |
MMBench: Is your multi-modal model an all-around player? Y Liu, H Duan, Y Zhang, B Li, S Zhang, W Zhao, Y Yuan, J Wang, C He, ... arXiv preprint arXiv:2307.06281, 2023 | 243 | 2023 |
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark MMA Contributors GitHub repository, https://github.com/open-mmlab/mmaction2, 2020 | 163 | 2020 |
InternLM: A multilingual language model with progressively enhanced capabilities ILM Team GitHub repository, https://github.com/InternLM/InternLM, 2023 | 131 | 2023 |
Omni-sourced webly-supervised learning for video recognition H Duan, Y Zhao, Y Xiong, W Liu, D Lin ECCV 2020, 2020 | 100 | 2020 |
PYSKL: Towards Good Practices for Skeleton Action Recognition H Duan, J Wang, K Chen, D Lin ACMMM 2022, 2022 | 94 | 2022 |
InternLM-XComposer: A vision-language large model for advanced text-image comprehension and composition P Zhang, XDB Wang, Y Cao, C Xu, L Ouyang, Z Zhao, S Ding, S Zhang, ... arXiv preprint arXiv:2309.15112, 2023 | 82 | 2023 |
OpenCompass: A universal evaluation platform for foundation models OC Contributors GitHub repository, https://github.com/open-compass/[opencompass/VLMEvalKit], 2023 | 81 | 2023 |
SRPGAN: perceptual generative adversarial network for single image super resolution B Wu, H Duan, Z Liu, G Sun arXiv preprint arXiv:1712.05927, 2017 | 78 | 2017 |
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ... arXiv preprint arXiv:2401.16420, 2024 | 53 | 2024 |
DG-STGCN: Dynamic spatial-temporal modeling for skeleton-based action recognition H Duan, J Wang, K Chen, D Lin arXiv preprint arXiv:2210.05895, 2022 | 31 | 2022 |
MMAction Y Zhao, H Duan, Y Xiong, D Lin GitHub repository, https://github.com/open-mmlab/mmaction, 2019 | 28 | 2019 |
OCSampler: Compressing Videos to One Clip with Single-step Sampling J Lin, H Duan, K Chen, D Lin, L Wang CVPR 2022, 2022 | 27 | 2022 |
InternLM2 technical report Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... arXiv preprint arXiv:2403.17297, 2024 | 23 | 2024 |
JourneyDB: A benchmark for generative image understanding K Sun, J Pan, Y Ge, H Li, H Duan, X Wu, R Zhang, A Zhou, Z Qin, Y Wang, ... NeurIPS 2023 Datasets, 2024 | 22 | 2024 |
TRB: A novel triplet representation for understanding 2D human body H Duan, KY Lin, S Jin, W Liu, C Qian, W Ouyang ICCV 2019, 2019 | 19 | 2019 |
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition H Duan, N Zhao, K Chen, D Lin CVPR 2022, 2022 | 18 | 2022 |
Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences Y Zhou, H Duan, A Rao, B Su, J Wang AAAI 2023, 2023 | 17 | 2023 |
Are We on the Right Way for Evaluating Large Vision-Language Models? L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ... arXiv preprint arXiv:2403.20330, 2024 | 12 | 2024 |
InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ... arXiv preprint arXiv:2404.06512, 2024 | 10 | 2024 |