SeD: Semantic-aware discriminator for image super-resolution

B Li, X Li, H Zhu, Y Jin, R Feng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract Generative Adversarial Networks (GANs) have been widely used to recover vivid
textures in image super-resolution (SR) tasks. In particular, one discriminator is utilized to …

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

J Bai, K Gao, S Min, ST Xia, Z Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Abstract Contrastive Vision-Language Pre-training, known as CLIP, has shown promising
effectiveness in addressing downstream image recognition tasks. However, recent works …

Not all prompts are secure: A switchable backdoor attack against pre-trained vision transformers

S Yang, J Bai, K Gao, Y Yang, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Given the power of vision transformers, a new learning paradigm, pre-training and then
prompting, makes it more efficient and effective to address downstream visual recognition …

Parameter-efficient and memory-efficient tuning for vision transformer: a disentangled approach

T Zhang, J Bai, Z Lu, D Lian, G Wang, X Wang… - … on Computer Vision, 2025 - Springer
Recent works on parameter-efficient transfer learning (PETL) show the potential to adapt a
pre-trained Vision Transformer to downstream recognition tasks with only a few learnable …

Few-Shot Image Classification of Crop Diseases Based on Vision–Language Models

Y Zhou, H Yan, K Ding, T Cai, Y Zhang - Sensors, 2024 - mdpi.com
Accurate crop disease classification is crucial for ensuring food security and enhancing
agricultural productivity. However, the existing crop disease classification algorithms …

BoostAdapter: Improving test-time adaptation via regional bootstrapping

T Zhang, J Wang, H Guo, T Dai, B Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Adaptation of pretrained vision-language models such as CLIP to various downstream tasks
has raised great interest in recent research. Previous works have proposed a variety of …

MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model

X Wang, Y Yang, M Zhu, K Zheng, S Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in pre-trained Vision-Language Models (VLMs) have highlighted the
significant potential of prompt tuning for adapting these models to a wide range of …

BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping

T Zhang, J Wang, H Guo, T Dai, B Chen… - The Thirty-eighth Annual … - openreview.net
Adaptation of pretrained vision-language models such as CLIP to various downstream tasks
has raised great interest in recent research. Previous works have proposed a variety of …