VOS: Learning What You Don't Know by Virtual Outlier Synthesis X Du, Z Wang, M Cai, Y Li Proceedings of the International Conference on Learning Representations 1 (4), 8, 2022 | 237 | 2022 |
Masked Discrimination for Self-Supervised Learning on Point Clouds H Liu, M Cai, YJ Lee Proceedings of the European Conference on Computer Vision (ECCV), 2022, 2022 | 116 | 2022 |
Frequency domain image translation: More photo-realistic, better identity-preserving M Cai, H Zhang, H Huang, Q Geng, Y Li, G Huang IEEE International Conference on Computer Vision (ICCV), 2021, 13930-13940, 2021 | 70 | 2021 |
Investigating the catastrophic forgetting in multimodal large language models Y Zhai, S Tong, X Li, M Cai, Q Qu, YJ Lee, Y Ma Conference on Parsimony and Learning (CPAL) 2023, 2023 | 62* | 2023 |
Out-of-distribution Detection via Frequency-regularized Generative Models M Cai, Y Li WACV (Spotlight), 2023, 2022 | 28 | 2022 |
Making large multimodal models understand arbitrary visual prompts M Cai, H Liu, SK Mustikovela, GP Meyer, Y Chai, D Park, YJ Lee CVPR 2024, 2024 | 24 | 2024 |
A Game-Theoretic Strategy-Aware Interaction Algorithm with Validation on Real Traffic Data L Sun*, M Cai*, W Zhan, M Tomizuka The 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2020 | 15 | 2020 |
LLaVA-PruMerge: Adaptive token reduction for efficient large multimodal models Y Shang, M Cai, B Xu, YJ Lee, Y Yan arXiv preprint arXiv:2403.15388, 2024 | 6 | 2024 |
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance Z Huang, A Zhou, Z Lin, M Cai, H Wang, YJ Lee ICCV 2023, 2023 | 6 | 2023 |
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding M Cai, Z Huang, Y Li, H Wang, YJ Lee arXiv preprint arXiv:2306.06094, 2023 | 5 | 2023 |
Matryoshka Multimodal Models M Cai, J Yang, J Gao, YJ Lee arXiv preprint arXiv:2405.17430, 2024 | 2 | 2024 |
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy X Li, C Mata, J Park, K Kahatapitiya, YS Jang, J Shang, K Ranasinghe, ... arXiv preprint arXiv:2406.20095, 2024 | | 2024 |
Yo'LLaVA: Your Personalized Language and Vision Assistant T Nguyen, H Liu, Y Li, M Cai, U Ojha, YJ Lee arXiv preprint arXiv:2406.09400, 2024 | | 2024 |
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples J Zhang*, M Cai*, T Xie, YJ Lee Findings of the Association for Computational Linguistics: ACL 2024, 2024 | | 2024 |
Cross-Modal Self-Supervised Learning with Effective Contrastive Units for Point Clouds M Cai, C Luo, YJ Lee, X Yang IROS 2024, 2024 | | 2024 |
Causal inference can prevent computer vision from falling into black-box deep learning M Cai | | |