Shallow and deep convolutional networks for saliency prediction J Pan*, K McGuinness*, N O'Connor, E Sayrol, X Giro-i-Nieto Computer Vision and Pattern Recognition (CVPR) 2016, 598-606, 2016 | 588 | 2016 |
Salgan: Visual saliency prediction with generative adversarial networks J Pan, CC Ferrer, K McGuinness, NE O'Connor, J Torres, E Sayrol, ... CVPR Scene Understanding Workshop (SUNw) 2017, 2017 | 553 | 2017 |
Internvideo: General video foundation models via generative and discriminative learning Y Wang, K Li, Y Li, Y He, B Huang, Z Zhao, H Zhang, J Xu, Y Liu, Z Wang, ... arXiv preprint arXiv:2212.03191, 2022 | 204 | 2022 |
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning J Pan, Z Lin, X Zhu, J Shao, H Li Advances in Neural Information Processing Systems (NeurIPS), 2022 | 155 | 2022 |
Actor-context-actor relation network for spatio-temporal action localization J Pan, S Chen, MZ Shou, Y Liu, J Shao, H Li Conference on Computer Vision and Pattern Recognition (CVPR) 2021, 464-474, 2021 | 150 | 2021 |
Edgevits: Competing light-weight cnns on mobile devices with vision transformers J Pan, A Bulat, F Tan, X Zhu, L Dudziak, H Li, G Tzimiropoulos, B Martinez European Conference on Computer Vision (ECCV) 2022, 2022 | 149 | 2022 |
Online detection of action start in untrimmed, streaming videos J Pan*, Z Shou*, J Chan, K Miyazawa, H Mansour, A Vetro, X Giro-i-Nieto, ... European Conference on Computer Vision (ECCV) 2018, 534-551, 2018 | 106* | 2018 |
Video Generation from Single Semantic Label Map J Pan, C Wang, X Jia, J Shao, L Sheng, J Yan, X Wang Computer Vision and Pattern Recognition (CVPR) 2019, 3733-3742, 2019 | 104 | 2019 |
Personalize segment anything model with one shot R Zhang, Z Jiang, Z Guo, S Yan, J Pan, X Ma, H Dong, P Gao, H Li arXiv preprint arXiv:2305.03048, 2023 | 101 | 2023 |
Videollm: Modeling video sequence with large language models G Chen, YD Zheng, J Wang, J Xu, Y Huang, J Pan, Y Wang, Y Wang, ... arXiv preprint arXiv:2305.13292, 2023 | 54 | 2023 |
Internvideo-ego4d: A pack of champion solutions to ego4d challenges G Chen, S Xing, Z Chen, Y Wang, K Li, Y Li, Y Liu, J Wang, YD Zheng, ... arXiv preprint arXiv:2211.09529, 2022 | 38 | 2022 |
Journeydb: A benchmark for generative image understanding K Sun, J Pan, Y Ge, H Li, H Duan, X Wu, R Zhang, A Zhou, Z Qin, Y Wang, ... Advances in Neural Information Processing Systems 36, 2024 | 25 | 2024 |
Retrieving-to-answer: Zero-shot video question answering with frozen large language models J Pan, Z Lin, Y Ge, X Zhu, R Zhang, Y Wang, Y Qiao, H Li Proceedings of the IEEE/CVF International Conference on Computer Vision, 272-283, 2023 | 13 | 2023 |
Measuring multimodal mathematical reasoning with math-vision dataset K Wang, J Pan, W Shi, Z Lu, M Zhan, H Li arXiv preprint arXiv:2402.14804, 2024 | 11 | 2024 |
High-Quality Video Generation from Static Structural Annotations L Sheng*, J Pan*, J Guo, J Shao, CC Loy International Journal of Computer Vision 128, 2552-2569, 2020 | 7 | 2020 |
Mathgenie: Generating synthetic data with question back-translation for enhancing mathematical reasoning of llms Z Lu, A Zhou, H Ren, K Wang, W Shi, J Pan, M Zhan, H Li arXiv preprint arXiv:2402.16352, 2024 | 6 | 2024 |
Lego: Language enhanced multi-modal grounding model Z Li, Q Xu, D Zhang, H Song, Y Cai, Q Qi, R Zhou, J Pan, Z Li, VT Vu, ... arXiv preprint arXiv:2401.06071, 2024 | 5 | 2024 |
Sparsemae: Sparse training meets masked autoencoders A Zhou, Y Li, Z Qin, J Liu, J Pan, R Zhang, R Zhao, P Gao, H Li Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 2 | 2023 |
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning Z Lu, A Zhou, K Wang, H Ren, W Shi, J Pan, M Zhan arXiv preprint arXiv:2407.00782, 2024 | | 2024 |
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation H Ren, M Zhan, Z Wu, A Zhou, J Pan, H Li arXiv preprint arXiv:2405.17057, 2024 | | 2024 |