CoCa: Contrastive captioners are image-text foundation models J Yu, Z Wang, V Vasudevan, L Yeung, M Seyedhosseini, Y Wu Transactions on Machine Learning Research, 2022 | 1029 | 2022 |
PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 1012 | 2023 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 843 | 2022 |
Scaling autoregressive models for content-rich text-to-image generation J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang, V Vasudevan, A Ku, Y Yang, ... Transactions on Machine Learning Research, 2022 | 767 | 2022 |
SimVLM: Simple visual language model pretraining with weak supervision Z Wang, J Yu, AW Yu, Z Dai, Y Tsvetkov, Y Cao ICLR 2022, 2022 | 692 | 2022 |
Characterizing and avoiding negative transfer Z Wang, Z Dai, B Póczos, J Carbonell CVPR 2019, 2019 | 510 | 2019 |
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models Z Wang, Y Tsvetkov, O Firat, Y Cao ICLR 2021, 2021 | 159 | 2021 |
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment Z Wang, ZC Lipton, Y Tsvetkov EMNLP 2020, 2020 | 113 | 2020 |
Ferret: Refer and ground anything anywhere at any granularity H You, H Zhang, Z Gan, X Du, B Zhang, Z Wang, L Cao, SF Chang, ... arXiv preprint arXiv:2310.07704, 2023 | 98 | 2023 |
Cross-lingual alignment vs joint training: A comparative study and a simple unified framework Z Wang, J Xie, R Xu, Y Yang, G Neubig, J Carbonell ICLR 2020, 2020 | 76 | 2020 |
Towards zero-label language learning Z Wang, AW Yu, O Firat, Y Cao arXiv preprint arXiv:2109.09193, 2021 | 75 | 2021 |
VideoCoCa: Video-text modeling with zero-shot transfer from contrastive captioners S Yan, T Zhu, Z Wang, Y Cao, M Zhang, S Ghosh, Y Wu, J Yu arXiv preprint arXiv:2212.04979, 2022 | 60 | 2022 |
Efficient Meta Lifelong-Learning with Limited Memory Z Wang, SV Mehta, B Póczos, J Carbonell EMNLP 2020, 2020 | 56 | 2020 |
MM1: Methods, analysis & insights from multimodal LLM pre-training B McKinzie, Z Gan, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ... arXiv preprint arXiv:2403.09611, 2024 | 52 | 2024 |
REVEAL: Retrieval-augmented visual-language pre-training with multi-source multimodal knowledge memory Z Hu, A Iscen, C Sun, Z Wang, KW Chang, Y Sun, C Schmid, DA Ross, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023 | 48 | 2023 |
CoCa: Contrastive captioners are image-text foundation models. arXiv 2022 J Yu, Z Wang, V Vasudevan, L Yeung, M Seyedhosseini, Y Wu arXiv preprint arXiv:2205.01917, 2022 | 36 | 2022 |
Scaling autoregressive models for content-rich text-to-image generation. arXiv 2022 J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang, V Vasudevan, A Ku, Y Yang, ... arXiv preprint arXiv:2206.10789, 2022 | 24 | 2022 |
MedBLIP: Bootstrapping language-image pre-training from 3D medical images and texts Q Chen, X Hu, Z Wang, Y Hong arXiv preprint arXiv:2305.10799, 2023 | 15 | 2023 |
Theoretical guarantees of transfer learning Z Wang arXiv preprint arXiv:1810.05986, 2018 | 15 | 2018 |
Towards more Reliable Transfer Learning Z Wang, J Carbonell ECML-PKDD 2018, 2018 | 12 | 2018 |