GPT4Tools: Teaching large language model to use tools via self-instruction R Yang, L Song, Y Li, S Zhao, Y Ge, X Li, Y Shan Advances in Neural Information Processing Systems 36, 2024 | 91 | 2024 |
Making LLaMA SEE and draw with SEED tokenizer Y Ge, S Zhao, Z Zeng, Y Ge, C Li, X Wang, Y Shan arXiv preprint arXiv:2310.01218, 2023 | 34 | 2023 |
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio Video Point Cloud Time-Series and Image Recognition X Ding, Y Zhang, Y Ge, S Zhao, L Song, X Yue, Y Shan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 30 | 2024 |
Distribution-aware adaptive multi-bit quantization S Zhao, T Yue, X Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 29 | 2021 |
VL-GPT: A generative pre-trained transformer for vision and language understanding and generation J Zhu, X Ding, Y Ge, Y Ge, S Zhao, H Zhao, X Wang, Y Shan arXiv preprint arXiv:2312.09251, 2023 | 15 | 2023 |
SEED-X: Multimodal models with unified multi-granularity comprehension and generation Y Ge, S Zhao, J Zhu, Y Ge, K Yi, L Song, C Li, X Ding, Y Shan arXiv preprint arXiv:2404.14396, 2024 | 8 | 2024 |
Fisher information guidance for learned time-of-flight imaging J Li, T Yue, S Zhao, X Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 3 | 2022 |
Sticker820k: Empowering interactive retrieval with stickers S Zhao, Y Ge, Z Qi, L Song, X Ding, Z Xie, Y Shan arXiv preprint arXiv:2306.06870, 2023 | 2 | 2023 |
CV-VAE: A Compatible Video VAE for Latent Generative Video Models S Zhao, Y Zhang, X Cun, S Yang, M Niu, X Li, W Hu, Y Shan arXiv preprint arXiv:2405.20279, 2024 | | 2024 |
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing Y Ge, S Zhao, C Li, Y Ge, Y Shan arXiv preprint arXiv:2405.04007, 2024 | | 2024 |