CDUL: CLIP-driven unsupervised learning for multi-label image classification
This paper presents a CLIP-based unsupervised learning method for annotation-free multi-
label image classification, including three stages: initialization, training, and inference. At the …
Texts as images in prompt tuning for multi-label image recognition
Prompt tuning has been employed as an efficient way to adapt large vision-language
pre-trained models (e.g., CLIP) to various downstream tasks in data-limited or label-limited …
Exploring structured semantic prior for multi label recognition with incomplete labels
Multi-label recognition (MLR) with incomplete labels is very challenging. Recent works strive
to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to …
Bridging the gap between model explanations in partially annotated multi-label classification
Due to the expensive costs of collecting labels in multi-label classification datasets, partially
annotated multi-label classification has become an emerging field in computer vision. One …
DualCoOp++: Fast and effective adaptation to multi-label recognition with limited annotations
Multi-label image recognition in the low-label regime is a task of great challenge and
practical significance. Previous works have focused on learning the alignment between …
Spatial-temporal knowledge-embedded transformer for video scene graph generation
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer
their relationships for a given video. It requires not only a comprehensive understanding of …
Ingredient prediction via context learning network with class-adaptive asymmetric loss
Ingredient prediction has received more and more attention with the help of image
processing for its diverse real-world applications, such as nutrition intake management and …
Saliency Regularization for Self-Training with Partial Annotations
S Wang, Q Wan, X Xiang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Partially annotated images are easy to obtain in multi-label classification. However,
unknown labels in partially annotated images exacerbate the positive-negative imbalance …
Generating diverse augmented attributes for generalized zero shot learning
Generalized Zero-Shot Learning (GZSL) has become an important research due to
its powerful ability of recognizing unseen objects. Generative methods, converting …
Positive label is all you need for multi-label classification
Z Yuan, K Zhang, T Huang - arXiv preprint arXiv:2306.16016, 2023 - arxiv.org
Multi-label classification (MLC) suffers from the inevitable label noise in training data due to
the difficulty in annotating various semantic labels in each image. To mitigate the influence …