A survey of modern deep learning based object detection models

SSA Zaidi, MS Ansari, A Aslam, N Kanwal… - Digital Signal …, 2022 - Elsevier
Object detection is the task of classifying and localizing objects in an image or video.
It has gained prominence in recent years due to its widespread applications. This article …

Domain generalization: A survey

K Zhou, Z Liu, Y Qiao, T Xiang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet
challenging for machines to reproduce. This is because most learning algorithms strongly …

Visual prompt tuning

M Jia, L Tang, BC Chen, C Cardie, S Belongie… - … on Computer Vision, 2022 - Springer
The current modus operandi in adapting pre-trained models involves updating all the
backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) …
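
The idea named in the title can be illustrated compactly: insert a few learnable tokens into a frozen vision transformer's input sequence and train only those tokens plus a classification head. The sketch below assumes a timm-style ViT interface (patch_embed, cls_token, pos_embed, blocks, norm); it is a minimal illustration of "shallow" prompting under those assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class VisualPromptedViT(nn.Module):
    # Sketch: learnable prompt tokens inserted into a frozen ViT.
    # Attribute names assume a timm-style VisionTransformer backbone.
    def __init__(self, backbone, num_prompts=10, num_classes=100):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                  # freeze the whole backbone
        d = backbone.embed_dim
        self.prompts = nn.Parameter(torch.randn(num_prompts, d) * 0.02)
        self.head = nn.Linear(d, num_classes)        # prompts + head are trained

    def forward(self, x):
        B = x.size(0)
        x = self.backbone.patch_embed(x)                         # (B, N, D)
        cls = self.backbone.cls_token.expand(B, -1, -1)          # (B, 1, D)
        x = torch.cat([cls, x], dim=1) + self.backbone.pos_embed
        p = self.prompts.unsqueeze(0).expand(B, -1, -1)          # (B, P, D)
        x = torch.cat([x[:, :1], p, x[:, 1:]], dim=1)            # insert prompts after [CLS]
        for blk in self.backbone.blocks:
            x = blk(x)
        return self.head(self.backbone.norm(x)[:, 0])            # classify on [CLS]
```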

Conditional prompt learning for vision-language models

K Zhou, J Yang, CC Loy, Z Liu - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential
to investigate ways to adapt these models to downstream datasets. A recently proposed …

Unified-IO: A unified model for vision, language, and multi-modal tasks

J Lu, C Clark, R Zellers, R Mottaghi… - The Eleventh …, 2022 - openreview.net
We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical
computer vision tasks, including pose estimation, object detection, depth estimation and …

Out-of-distribution detection with deep nearest neighbors

Y Sun, Y Ming, X Zhu, Y Li - International Conference on …, 2022 - proceedings.mlr.press
Out-of-distribution (OOD) detection is a critical task for deploying machine learning
models in the open world. Distance-based methods have demonstrated promise, where …
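
As a rough sketch of the distance-based approach the abstract alludes to: score each test sample by the distance to its k-th nearest training neighbor in L2-normalized feature space, and flag large distances as OOD. The value of k and the normalization below follow common practice and are assumptions, not necessarily the paper's exact configuration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_ood_scores(train_feats, test_feats, k=50):
    # Larger score = farther from the training distribution = more likely OOD.
    def l2n(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    index = NearestNeighbors(n_neighbors=k).fit(l2n(train_feats))
    dists, _ = index.kneighbors(l2n(test_feats))   # (num_test, k), ascending
    return dists[:, -1]                            # distance to the k-th neighbor

# Usage: flag a sample as OOD when its score exceeds a threshold chosen on
# held-out in-distribution data (e.g., the 95th percentile of ID scores).
```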

Test-time prompt tuning for zero-shot generalization in vision-language models

M Shu, W Nie, DA Huang, Z Yu… - Advances in …, 2022 - proceedings.neurips.cc
Pre-trained vision-language models (e.g., CLIP) have shown promising zero-shot
generalization in many downstream tasks with properly designed text prompts. Instead of …
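
A hedged sketch of what tuning a prompt at test time can look like: generate augmented views of a single test image and update only the prompt parameters to minimize the entropy of the averaged prediction, keeping all model weights frozen. The names below (clip_logits_fn, views, prompt_params) are hypothetical placeholders for a CLIP-style pipeline; the paper's full method includes details, such as confidence-based view selection, that are omitted here.

```python
import torch

def test_time_prompt_step(clip_logits_fn, views, optimizer):
    # views: a batch of augmentations of one test image.
    # The optimizer holds only the learnable prompt parameters,
    # so this step adapts the prompt, not the model weights.
    logits = clip_logits_fn(views)                   # (num_views, num_classes)
    probs = logits.softmax(dim=-1).mean(dim=0)       # marginal over views
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
    optimizer.zero_grad()
    entropy.backward()                               # push views toward agreement
    optimizer.step()
```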

Mitigating neural network overconfidence with logit normalization

H Wei, R Xie, H Cheng, L Feng… - … conference on machine …, 2022 - proceedings.mlr.press
Detecting out-of-distribution inputs is critical for the safe deployment of machine learning
models in the real world. However, neural networks are known to suffer from the …
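
The titular idea admits a very small sketch: compute cross-entropy on logits rescaled by their own L2 norm (with a temperature), so that optimization cannot reduce the loss simply by growing logit magnitudes, which is what drives softmax overconfidence. The temperature value below is illustrative; it is a tunable hyperparameter.

```python
import torch
import torch.nn.functional as F

def logit_norm_loss(logits, targets, temperature=0.04):
    # Cross-entropy on L2-normalized logits; the small epsilon guards
    # against division by zero for an all-zero logit vector.
    norms = logits.norm(p=2, dim=-1, keepdim=True) + 1e-7
    return F.cross_entropy(logits / (norms * temperature), targets)
```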

SLIP: Self-supervision meets language-image pre-training

N Mu, A Kirillov, D Wagner, S Xie - European conference on computer …, 2022 - Springer
Recent work has shown that self-supervised pre-training leads to improvements over
supervised learning on challenging visual recognition tasks. CLIP, an exciting new …

Learning to prompt for vision-language models

K Zhou, J Yang, CC Loy, Z Liu - International Journal of Computer Vision, 2022 - Springer
Large pre-trained vision-language models like CLIP have shown great potential in learning
representations that are transferable across a wide range of downstream tasks. Different …
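
A minimal sketch of the prompt-learning idea this line of work is associated with: replace the hand-written context words of a prompt (e.g., "a photo of a") with a small set of trainable vectors shared across classes, prepended to each class name's token embeddings before they enter CLIP's frozen text encoder. The interface below is an assumed CLIP-like one, not the authors' code.

```python
import torch
import torch.nn as nn

class LearnableContext(nn.Module):
    # num_ctx trainable context vectors replace hand-crafted prompt words;
    # token_embed_dim is assumed to match the text encoder's embedding size.
    def __init__(self, num_ctx=16, token_embed_dim=512):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(num_ctx, token_embed_dim) * 0.02)

    def forward(self, class_token_embeds):
        # class_token_embeds: (num_classes, L, D) embeddings of class names.
        C = class_token_embeds.size(0)
        ctx = self.ctx.unsqueeze(0).expand(C, -1, -1)        # (C, num_ctx, D)
        return torch.cat([ctx, class_token_embeds], dim=1)   # prepend context
```

Only self.ctx receives gradients; the text and image encoders stay frozen, which is what makes this form of adaptation cheap relative to full fine-tuning.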