DINO: DETR with improved denoising anchor boxes for end-to-end object detection H Zhang*, F Li*, S Liu*, L Zhang, H Su, J Zhu, LM Ni, HY Shum International Conference on Learning Representations (ICLR), 2023, 2022 | 910 | 2022 |
Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection S Liu, Z Zeng, T Ren, F Li, H Zhang, J Yang, C Li, J Yang, H Su, J Zhu, ... ECCV 2024, 2023 | 737 | 2023 |
DAB-DETR: Dynamic anchor boxes are better queries for DETR S Liu, F Li, H Zhang, X Yang, X Qi, H Su, J Zhu, L Zhang International Conference on Learning Representations (ICLR), 2022, 2022 | 549 | 2022 |
DN-DETR: Accelerate DETR training by introducing query denoising F Li*, H Zhang*, S Liu, J Guo, LM Ni, L Zhang The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR …, 2022 | 481 | 2022 |
Segment everything everywhere all at once X Zou*, J Yang*, H Zhang*, F Li*, L Li, J Gao, YJ Lee NeurIPS 2023, 2023 | 297 | 2023 |
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation F Li*, H Zhang*, S Liu, L Zhang, LM Ni, HY Shum The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2023, 2022 | 241 | 2022 |
A simple framework for open-vocabulary segmentation and detection H Zhang*, F Li*, X Zou, S Liu, C Li, J Gao, J Yang, L Zhang ICCV 2023, 2023 | 91 | 2023 |
Semantic-SAM: Segment and Recognize Anything at Any Granularity F Li*, H Zhang*, P Sun, X Zou, S Liu, J Yang, C Li, L Zhang, J Gao ECCV 2024, 2023 | 87 | 2023 |
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V J Yang*, H Zhang*, F Li*, X Zou*, C Li, J Gao arXiv preprint arXiv:2310.11441, 2023 | 78 | 2023 |
Grounded SAM: Assembling open-world models for diverse visual tasks T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ... arXiv preprint arXiv:2401.14159, 2024 | 53 | 2024 |
LLaVA-Plus: Learning to use tools for creating multimodal agents S Liu, H Cheng, H Liu, H Zhang, F Li, T Ren, X Zou, J Yang, H Su, J Zhu, ... ECCV 2024, 2023 | 51 | 2023 |
Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR F Li, A Zeng, S Liu, H Zhang, H Li, L Zhang, LM Ni CVPR 2023, 2023 | 43 | 2023 |
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models F Li*, H Zhang*, YF Zhang, S Liu, J Guo, LM Ni, PC Zhang, L Zhang arXiv preprint arXiv:2203.01922, 2022 | 34 | 2022 |
MP-Former: Mask-Piloted Transformer for Image Segmentation H Zhang, F Li, H Xu, S Huang, S Liu, LM Ni, L Zhang The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2023, 2023 | 29 | 2023 |
Detection Transformer with Stable Matching S Liu, T Ren, J Chen, Z Zeng, H Zhang, F Li, H Li, J Huang, H Su, J Zhu, ... ICCV 2023, 2023 | 22 | 2023 |
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models H Zhang*, H Li*, F Li, T Ren, X Zou, S Liu, S Huang, J Gao, L Zhang, C Li, ... ECCV 2024, 2023 | 15 | 2023 |
Multi-relation message passing for multi-label text classification M Ozmen, H Zhang, P Wang, M Coates ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 14 | 2022 |
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding S Liu, Y Liang, F Li, S Huang, H Zhang, H Su, J Zhu, L Zhang AAAI 2023, 2022 | 11 | 2022 |
detrex: Benchmarking Detection Transformers T Ren*, S Liu*, F Li*, H Zhang*, A Zeng, J Yang, X Liao, D Jia, H Li, H Cao, ... arXiv preprint arXiv:2306.07265, 2023 | 8 | 2023 |
Introducing Depth into Transformer-based 3D Object Detection H Zhang, H Li, A Zeng, F Li, S Liu, X Liao, L Zhang arXiv preprint arXiv:2302.13002, 2023 | 8* | 2023 |