Few-shot object detection: A survey
Deep learning approaches have recently raised the bar in many fields, from Natural
Language Processing to Computer Vision, by leveraging large amounts of data. However …
Survey on multi-output learning
The aim of multi-output learning is to simultaneously predict multiple outputs given an input.
It is an important learning problem for decision-making since making decisions in the real …
Expanding language-image pretrained models for general video recognition
Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …
Deep hierarchical semantic segmentation
Humans are able to recognize structured relations in observation, allowing us to decompose
complex scenes into simpler parts and abstract the visual world in multiple levels. However …
Decoupling zero-shot semantic segmentation
Zero-shot semantic segmentation (ZS3) aims to segment the novel categories that have not
been seen in the training. Existing works formulate ZS3 as a pixel-level zero-shot …
Contrastive embedding for generalized zero-shot learning
Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and
unseen classes, when only the labeled examples from seen classes are provided. Recent …
Combined scaling for zero-shot transfer learning
Recent developments in multimodal training methodologies, including CLIP and ALIGN,
obviate the necessity for individual data labeling. These approaches utilize pairs of data and …
FREE: Feature refinement for generalized zero-shot learning
Generalized zero-shot learning (GZSL) has achieved significant progress, with many efforts
dedicated to overcoming the problems of visual-semantic domain gaps and seen-unseen …
CLIP2Scene: Towards label-efficient 3D scene understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) achieves promising results in 2D
zero-shot and few-shot learning. Despite the impressive performance in 2D, applying CLIP …
Fine-tuned CLIP models are efficient video learners
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …