DINO: DETR with improved denoising anchor boxes for end-to-end object detection H Zhang, F Li, S Liu, L Zhang, H Su, J Zhu, LM Ni, HY Shum arXiv preprint arXiv:2203.03605, 2022 | 868 | 2022 |
Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection S Liu, Z Zeng, T Ren, F Li, H Zhang, J Yang, C Li, J Yang, H Su, J Zhu, ... arXiv preprint arXiv:2303.05499, 2023 | 706 | 2023 |
DAB-DETR: Dynamic anchor boxes are better queries for DETR S Liu, F Li, H Zhang, X Yang, X Qi, H Su, J Zhu, L Zhang arXiv preprint arXiv:2201.12329, 2022 | 541 | 2022 |
DN-DETR: Accelerate DETR training by introducing query denoising F Li, H Zhang, S Liu, J Guo, LM Ni, L Zhang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 477 | 2022 |
Mask DINO: Towards a unified transformer-based framework for object detection and segmentation F Li, H Zhang, H Xu, S Liu, L Zhang, LM Ni, HY Shum Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 243 | 2023 |
Query2Label: A simple transformer way to multi-label classification S Liu, L Zhang, X Yang, H Su, J Zhu arXiv preprint arXiv:2107.10834, 2021 | 173 | 2021 |
Recognize anything: A strong image tagging model Y Zhang, X Huang, J Ma, Z Li, Z Luo, Y Xie, Y Qin, T Luo, Y Li, S Liu, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 90 | 2024 |
A simple framework for open-vocabulary segmentation and detection H Zhang, F Li, X Zou, S Liu, C Li, J Yang, L Zhang Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 88 | 2023 |
Semantic-SAM: Segment and recognize anything at any granularity F Li, H Zhang, P Sun, X Zou, S Liu, J Yang, C Li, L Zhang, J Gao arXiv preprint arXiv:2307.04767, 2023 | 85 | 2023 |
DINO: DETR with improved denoising anchor boxes for end-to-end object detection H Zhang, F Li, S Liu, L Zhang, H Su, J Zhu, LM Ni, HY Shum arXiv preprint arXiv:2203.03605, 2022 | 54 | 2022 |
Grounded SAM: Assembling open-world models for diverse visual tasks T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ... arXiv preprint arXiv:2401.14159, 2024 | 46 | 2024 |
LLaVA-Plus: Learning to use tools for creating multimodal agents S Liu, H Cheng, H Liu, H Zhang, F Li, T Ren, X Zou, J Yang, H Su, J Zhu, ... arXiv preprint arXiv:2311.05437, 2023 | 46 | 2023 |
Explicit box detection unifies end-to-end multi-person pose estimation J Yang, A Zeng, S Liu, F Li, R Zhang, L Zhang arXiv preprint arXiv:2302.01593, 2023 | 44 | 2023 |
Lite DETR: An interleaved multi-scale encoder for efficient DETR F Li, A Zeng, S Liu, H Zhang, H Li, L Zhang, LM Ni Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 39 | 2023 |
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition S Liu, L Qi, H Qin, J Shi, J Jia | 38 | 2018 |
Vision-language intelligence: Tasks, representation learning, and large models F Li, H Zhang, YF Zhang, S Liu, J Guo, LM Ni, PC Zhang, L Zhang arXiv preprint arXiv:2203.01922, 2022 | 33 | 2022 |
MP-Former: Mask-piloted transformer for image segmentation H Zhang, F Li, H Xu, S Huang, S Liu, LM Ni, L Zhang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 28 | 2023 |
Unsupervised part segmentation through disentangling appearance and shape S Liu, L Zhang, X Yang, H Su, J Zhu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 28 | 2021 |
Detection transformer with stable matching S Liu, T Ren, J Chen, Z Zeng, H Zhang, F Li, H Li, J Huang, H Su, J Zhu, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 21 | 2023 |
DINO: DETR with improved denoising anchor boxes for end-to-end object detection H Zhang, F Li, S Liu, L Zhang, H Su, J Zhu, LM Ni, HY Shum arXiv preprint arXiv:2203.03605, 2022 | 15 | 2022 |