Correlation information bottleneck: Towards adapting pretrained multimodal models for robust...

J Jiang, N Zheng - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Recently, finetuning pretrained vision-language models (VLMs) has been a prevailing
paradigm for achieving state-of-the-art performance in VQA. However, as VLMs scale, it …

Cited by 15 · Related articles · All 5 versions

Vision-language alignment learning under affinity and divergence principles for few-shot out-of-distribution generalization

L Zhu, W Yin, Y Yang, F Wu, Z Zeng, Q Gu… - International Journal of …, 2024 - Springer

Recent advances in fine-tuning large-scale vision-language pre-trained models (VL-PTMs)
have shown promising results in quick adaption to downstream tasks. However, prior …

Cited by 2 · Related articles

Unseen And Adverse Outdoor Scenes Recognition Through Event-based Captions

H Sakaino - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com

This paper presents EventCAP, i.e., event-based captions, for refined and enriched
qualitative and quantitative captions by Deep Learning (DL) models and Vision Language …

Cited by 1 · Related articles · All 3 versions

Measuring scientific inquiry ability related to hands-on practice: An automated approach based on multimodal data analysis

Y Song, L Guo, Q Zheng - Education and Information Technologies, 2024 - Springer

Scientific inquiry ability is closely related to the process of hands-on inquiry practice.
However, its assessment is often separated from this practice due to the limitation of …

CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering

Y Jiang, J Yin - arXiv preprint arXiv:2405.07451, 2024 - arxiv.org

While vision-language pretrained models (VLMs) excel in various multimodal understanding
tasks, their potential in fine-grained audio-visual reasoning, particularly for audio-visual …

[PDF] MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering Supplemental Material

J Jiang, N Zheng - openaccess.thecvf.com

The document provides some supplementary materials for our experiments. Specifically, in
Sec. 1, we explore the impact of different routing mechanisms and hyperparameters on …