Evaluation of output embeddings for fine-grained image classification

S Antonelli, D Avola, L Cinque, D Crisostomi… - ACM Computing …, 2022 - dl.acm.org

Deep learning approaches have recently raised the bar in many fields, from Natural
Language Processing to Computer Vision, by leveraging large amounts of data. However …

Cited by 65 Related articles All 5 versions

[PDF] arxiv.org

Survey on multi-output learning

D Xu, Y Shi, IW Tsang, YS Ong… - IEEE transactions on …, 2019 - ieeexplore.ieee.org

The aim of multi-output learning is to simultaneously predict multiple outputs given an input.
It is an important learning problem for decision-making since making decisions in the real …

Cited by 257 Related articles All 9 versions

[PDF] arxiv.org

Expanding language-image pretrained models for general video recognition

B Ni, H Peng, M Chen, S Zhang, G Meng, J Fu… - … on Computer Vision, 2022 - Springer

Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …

Cited by 206 Related articles All 6 versions

[PDF] thecvf.com

Deep hierarchical semantic segmentation

L Li, T Zhou, W Wang, J Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Humans are able to recognize structured relations in observation, allowing us to decompose
complex scenes into simpler parts and abstract the visual world in multiple levels. However …

Cited by 126 Related articles All 9 versions

[PDF] thecvf.com

Decoupling zero-shot semantic segmentation

J Ding, N Xue, GS Xia, D Dai - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com

Zero-shot semantic segmentation (ZS3) aims to segment the novel categories that have not
been seen in the training. Existing works formulate ZS3 as a pixel-level zero-shot …

Cited by 185 Related articles All 10 versions

[PDF] thecvf.com

Contrastive embedding for generalized zero-shot learning

Z Han, Z Fu, S Chen, J Yang - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and
unseen classes, when only the labeled examples from seen classes are provided. Recent …

Cited by 213 Related articles All 7 versions

[HTML] sciencedirect.com

[HTML] Combined scaling for zero-shot transfer learning

H Pham, Z Dai, G Ghiasi, K Kawaguchi, H Liu, AW Yu… - Neurocomputing, 2023 - Elsevier

Recent developments in multimodal training methodologies, including CLIP and ALIGN,
obviate the necessity for individual data labeling. These approaches utilize pairs of data and …

Cited by 142 Related articles All 5 versions

[PDF] thecvf.com

Free: Feature refinement for generalized zero-shot learning

S Chen, W Wang, B Xia, Q Peng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Generalized zero-shot learning (GZSL) has achieved significant progress, with many efforts
dedicated to overcoming the problems of visual-semantic domain gaps and seen-unseen …

Cited by 159 Related articles All 10 versions

[PDF] thecvf.com

Clip2scene: Towards label-efficient 3d scene understanding by clip

R Chen, Y Liu, L Kong, X Zhu, Y Ma… - Proceedings of the …, 2023 - openaccess.thecvf.com

Contrastive Language-Image Pre-training (CLIP) achieves promising results in 2D
zero-shot and few-shot learning. Despite the impressive performance in 2D, applying CLIP …

Cited by 86 Related articles All 6 versions

[PDF] thecvf.com

Fine-tuned clip models are efficient video learners

H Rasheed, MU Khattak, M Maaz… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …

Cited by 76 Related articles All 7 versions