LLaVA-UHD: An LMM Perceiving Any Aspect Ratio and High-Resolution Images R Xu, Y Yao, Z Guo, J Cui, Z Ni, C Ge, TS Chua, Z Liu, M Sun, G Huang arXiv preprint arXiv:2403.11703, 2024 | 29 | 2024 |
GUICourse: From General Vision Language Models to Versatile GUI Agents W Chen, J Cui, J Hu, Y Qin, J Fang, Y Zhao, C Wang, J Liu, G Chen, ... arXiv preprint arXiv:2406.11317, 2024 | | 2024 |