Large-scale multi-modal pre-trained models: A comprehensive survey
With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …
Integrating machine learning with human knowledge
Machine learning has been heavily researched and widely used in many disciplines.
However, achieving high accuracy requires a large amount of data that is sometimes …
General multi-label image classification with transformers
Multi-label image classification is the task of predicting a set of labels corresponding to
objects, attributes or other entities present in an image. In this work we propose the …
Mos: Towards scaling out-of-distribution detection for large semantic space
R Huang, Y Li - Proceedings of the IEEE/CVF Conference …, 2021 - openaccess.thecvf.com
Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying
machine learning models in the real world. Existing solutions are mainly driven by small …
A survey of zero-shot learning: Settings, methods, and applications
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …
Logic-induced diagnostic reasoning for semi-supervised semantic segmentation
Recent advances in semi-supervised semantic segmentation have been heavily reliant on
pseudo labeling to compensate for limited labeled data, disregarding the valuable relational …
Zero-shot recognition via semantic embeddings and knowledge graphs
We consider the problem of zero-shot recognition: learning a visual classifier for a category
with zero training examples, just using the word embedding of the category and its …
Knowledge-embedded routing network for scene graph generation
To understand a scene in depth not only involves locating/recognizing individual objects, but
also requires to infer the relationships and interactions among them. However, since the …
Optimization methods for large-scale machine learning
This paper provides a review and commentary on the past, present, and future of numerical
optimization algorithms in the context of machine learning applications. Through case …
Consensus-aware visual-semantic embedding for image-text matching
Image-text matching plays a central role in bridging vision and language. Most existing
approaches only rely on the image-text instance pair to learn their representations, thereby …