Complex Organ Mask Guided Radiology Report Generation T Gu, D Liu, Z Li, W Cai WACV2024, 2024 | 18 | 2024 |
RWKV-CLIP: A Robust Vision-Language Representation Learner T Gu, K Yang, X An, Z Feng, D Liu, W Cai, J Deng EMNLP2024, 2024 | 8 | 2024 |
CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination K Yang, T Gu, X An, H Jiang, X Dai, Z Feng, W Cai, J Deng AAAI2025, 2024 | 3 | 2024 |
NICE: CVPR 2023 Challenge on Zero-Shot Image Captioning T Kim, P Ahn, S Kim, S Lee, M Marsden, A Sala, SH Kim, B Han, KM Lee, ... CVPR2023 Workshop, 2024 | 3 | 2024 |
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering T Gu, K Yang, D Liu, W Cai CVPR2024 Workshop, 2024 | 2 | 2024 |
Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension Y Xie, K Yang, N Yang, W Deng, X Dai, T Gu, Y Wang, X An, Y Zhao, ... arXiv preprint arXiv:2410.14332, 2024 | 1 | 2024 |
ORID: Organ-Regional Information Driven Framework for Radiology Report Generation T Gu, K Yang, X An, Z Feng, D Liu, W Cai WACV2025, 2025 | | 2025 |