Turbo: Informativity-driven acceleration plug-in for vision-language models

Citing articles (3 results)

[PDF] thecvf.com

Audio-Visual Segmentation via Unlabeled Frame Exploitation

J Liu, Y Liu, F Zhang, C Ju… - Proceedings of the …, 2024 - openaccess.thecvf.com

Audio-visual segmentation (AVS) aims to segment the sounding objects in video frames.
Although great progress has been witnessed, we experimentally reveal that current methods …

Cited by 2 · All 4 versions

[PDF] arxiv.org

LHRS-Bot: Empowering remote sensing with VGI-enhanced large multimodal language model

D Muhtar, Z Li, F Gu, X Zhang, P Xiao - arXiv preprint arXiv:2402.02544, 2024 - arxiv.org

The revolutionary capabilities of large language models (LLMs) have paved the way for
multimodal large language models (MLLMs) and fostered diverse applications across …

Cited by 6 · All 2 versions

[PDF] arxiv.org

DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition

H Cheng, C Ju, H Wang, J Liu, M Chen, Q Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

As one of the fundamental video tasks in computer vision, Open-Vocabulary Action
Recognition (OVAR) recently gains increasing attention, with the development of vision …

Cited by 1 · All 2 versions