Mimic-it: Multi-modal in-context instruction tuning | B Li, Y Zhang, L Chen, J Wang, F Pu, J Yang, C Li, Z Liu | arXiv preprint arXiv:2306.05425, 2023 | 417 | 2023 |
Pair then relation: Pair-net for panoptic scene graph generation | J Wang, Z Wen, X Li, Z Guo, J Yang, Z Liu | arXiv preprint arXiv:2307.08699, 2023 | 3 | 2023 |
TransPatch: a transformer-based generator for accelerating transferable patch generation in adversarial attacks against object detection models | J Wang, C Cui, X Wen, J Shi | European Conference on Computer Vision, 317-331, 2022 | 1 | 2022 |
AID: Attention Interpolation of Text-to-Image Diffusion | Q He, J Wang, Z Liu, A Yao | arXiv preprint arXiv:2403.17924, 2024 | | 2024 |
Otter: A multi-modal model with in-context instruction tuning | B Li, Y Zhang, L Chen, J Wang, J Yang, Z Liu | arXiv preprint arXiv:2305.03726, 2023 | | 2023 |