Non-contrastive learning meets language-image pre-training
Contrastive language-image pre-training (CLIP) serves as a de facto standard to align
images and texts. Nonetheless, the loose correlation between images and texts of web …
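Many of the entries below refer to the same contrastive alignment objective: a symmetric cross-entropy over cosine similarities between image and text embeddings within a batch. The following is a minimal PyTorch sketch of that generic loss, assuming pre-computed image_emb and text_emb batches and an illustrative temperature value; it is not the implementation from any specific paper listed here.

import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    # L2-normalize both modalities so dot products are cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j) / T.
    logits = image_emb @ text_emb.t() / temperature

    # Matched image-text pairs lie on the diagonal of the similarity matrix.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy over the image-to-text and text-to-image directions.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2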
CALIP: Zero-shot enhancement of CLIP with parameter-free attention
Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual
representations with promising zero-shot performance. To further improve its downstream …
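Several of these works build on CLIP's zero-shot recognition, which scores an image embedding against text embeddings of prompted class names. A minimal sketch using the open-source clip package follows; the model name, class names, prompt template, and image path are illustrative assumptions rather than details taken from any paper above.

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # assumed backbone choice

class_names = ["cat", "dog", "car"]                 # hypothetical label set
prompts = [f"a photo of a {c}" for c in class_names]

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)  # placeholder path
text = clip.tokenize(prompts).to(device)

with torch.no_grad():
    image_feat = model.encode_image(image)
    text_feat = model.encode_text(text)
    # Normalize, then softmax over scaled cosine similarities to rank the class prompts.
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_feat @ text_feat.T).softmax(dim=-1)

print(dict(zip(class_names, probs[0].tolist())))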
Filtering, distillation, and hard negatives for vision-language pre-training
Vision-language models trained with contrastive learning on large-scale noisy data are
becoming increasingly popular for zero-shot recognition problems. In this paper we improve …
SuS-X: Training-free name-only transfer of vision-language models
Contrastive Language-Image Pre-training (CLIP) has emerged as a simple yet
effective way to train large-scale vision-language models. CLIP demonstrates impressive …
RA-CLIP: Retrieval augmented contrastive language-image pre-training
Contrastive Language-Image Pre-training (CLIP) is attracting increasing attention
for its impressive zero-shot recognition performance on different downstream tasks …
Long-CLIP: Unlocking the long-text capability of CLIP
Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-
shot classification, text-image retrieval, and text-image generation by aligning image and …
Supervision exists everywhere: A data efficient contrastive language-image pre-training paradigm
Recently, large-scale Contrastive Language-Image Pre-training (CLIP) has attracted
unprecedented attention for its impressive zero-shot recognition ability and excellent …
Unified contrastive learning in image-text-label space
Visual recognition has recently been learned via either supervised learning on human-annotated
image-label data or language-image contrastive learning with webly-crawled image-text …
Chinese CLIP: Contrastive vision-language pretraining in Chinese
The tremendous success of CLIP (Radford et al., 2021) has promoted the research and
application of contrastive learning for vision-language pretraining. In this work, we construct …
ReCLIP: Refine contrastive language-image pre-training with source-free domain adaptation
Large-scale pre-trained vision-language models (VLMs) such as CLIP have demonstrated
outstanding performance in zero-shot classification, e.g., achieving 76.3% top-1 accuracy on …