Large-scale multi-modal pre-trained models: A comprehensive survey

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer
With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …

Integrating machine learning with human knowledge

C Deng, X Ji, C Rainey, J Zhang, W Lu - iScience, 2020 - cell.com
Machine learning has been heavily researched and widely used in many disciplines.
However, achieving high accuracy requires a large amount of data that is sometimes …

General multi-label image classification with transformers

J Lanchantin, T Wang, V Ordonez… - Proceedings of the …, 2021 - openaccess.thecvf.com
Multi-label image classification is the task of predicting a set of labels corresponding to
objects, attributes or other entities present in an image. In this work we propose the …

MOS: Towards scaling out-of-distribution detection for large semantic space

R Huang, Y Li - Proceedings of the IEEE/CVF Conference …, 2021 - openaccess.thecvf.com
Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying
machine learning models in the real world. Existing solutions are mainly driven by small …

A survey of zero-shot learning: Settings, methods, and applications

W Wang, VW Zheng, H Yu, C Miao - ACM Transactions on Intelligent …, 2019 - dl.acm.org
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …

Logic-induced diagnostic reasoning for semi-supervised semantic segmentation

C Liang, W Wang, J Miao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recent advances in semi-supervised semantic segmentation have been heavily reliant on
pseudo labeling to compensate for limited labeled data, disregarding the valuable relational …

Zero-shot recognition via semantic embeddings and knowledge graphs

X Wang, Y Ye, A Gupta - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
We consider the problem of zero-shot recognition: learning a visual classifier for a category
with zero training examples, just using the word embedding of the category and its …

Knowledge-embedded routing network for scene graph generation

T Chen, W Yu, R Chen, L Lin - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
Understanding a scene in depth involves not only locating and recognizing individual objects, but
also inferring the relationships and interactions among them. However, since the …

Optimization methods for large-scale machine learning

L Bottou, FE Curtis, J Nocedal - SIAM Review, 2018 - SIAM
This paper provides a review and commentary on the past, present, and future of numerical
optimization algorithms in the context of machine learning applications. Through case …

Consensus-aware visual-semantic embedding for image-text matching

H Wang, Y Zhang, Z Ji, Y Pang, L Ma - … , Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
Image-text matching plays a central role in bridging vision and language. Most existing
approaches only rely on the image-text instance pair to learn their representations, thereby …