MaPLe: Multi-modal prompt learning MU Khattak, H Rasheed, M Maaz, S Khan, FS Khan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 374 | 2023 |
Video-ChatGPT: Towards detailed video understanding via large vision and language models M Maaz, H Rasheed, S Khan, FS Khan arXiv preprint arXiv:2306.05424, 2023 | 256 | 2023 |
Bridging the gap between object and image-level representations for open-vocabulary detection H Bangalath, M Maaz, MU Khattak, SH Khan, F Shahbaz Khan Advances in Neural Information Processing Systems 35, 33781-33794, 2022 | 123 | 2022 |
Fine-tuned CLIP models are efficient video learners H Rasheed, MU Khattak, M Maaz, S Khan, FS Khan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 87 | 2023 |
Class-agnostic object detection with multi-modal transformer M Maaz, H Rasheed, S Khan, FS Khan, RM Anwer, MH Yang European conference on computer vision, 512-531, 2022 | 85* | 2022 |
UNETR++: delving into efficient and accurate 3D medical image segmentation AM Shaker, M Maaz, H Rasheed, S Khan, MH Yang, FS Khan IEEE Transactions on Medical Imaging, 2024 | 67 | 2024 |
GLaMM: Pixel grounding large multimodal model H Rasheed, M Maaz, S Shaji, A Shaker, S Khan, H Cholakkal, RM Anwer, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 56 | 2024 |
SwiftFormer: Efficient additive attention for transformer-based real-time mobile vision applications A Shaker, M Maaz, H Rasheed, S Khan, MH Yang, FS Khan Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 31 | 2023 |
PG-Video-LLaVA: Pixel grounding large video-language models S Munasinghe, R Thushara, M Maaz, HA Rasheed, S Khan, M Shah, ... arXiv preprint arXiv:2311.13435, 2023 | 14 | 2023 |
PALO: A Polyglot Large Multimodal Model for 5B People M Maaz, H Rasheed, A Shaker, S Khan, H Cholakkal, RM Anwer, ... arXiv preprint arXiv:2402.14818, 2024 | 3 | 2024 |
Self-supervised learning for fine-grained visual categorization M Maaz, HA Rasheed, D Gaddam arXiv preprint arXiv:2105.08788, 2021 | 2 | 2021 |
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding M Maaz, H Rasheed, S Khan, F Khan arXiv preprint arXiv:2406.09418, 2024 | 1 | 2024 |
A System For Analyzing Milk Composition By a Reflection Probe T Francis, M Shankara, H Rasheed, D Joy IN Patent 19/2,021, 2021 | | 2021 |