A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
StableRep: Synthetic images from text-to-image models make strong visual representation learners
We investigate the potential of learning visual representations using synthetic images
generated by text-to-image models. This is a natural question in the light of the excellent …
A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations
Deep learning has emerged as a powerful tool in various domains, revolutionising machine
learning research. However, one persistent challenge is the scarcity of labelled training …
iBOT: Image BERT pre-training with online tokenizer
The success of language Transformers is primarily attributed to the pretext task of masked
language modeling (MLM), where texts are first tokenized into semantically meaningful …
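The masked language modeling pretext task mentioned here can be illustrated in a few lines: randomly replace tokens with a mask symbol and train the model to recover the originals at the masked positions. The sketch below is a minimal, generic illustration with a toy vocabulary and arbitrary sizes, not iBOT's actual training setup (which works on image patches with an online tokenizer):

```python
# Minimal masked-language-modeling sketch (illustrative only; vocabulary size,
# mask probability, and model dimensions are arbitrary toy choices).
import torch
import torch.nn as nn

VOCAB, MASK_ID = 1000, 1
MASK_PROB = 0.15

class TinyMLM(nn.Module):
    def __init__(self, vocab=VOCAB, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab)   # predicts the original token id at each position

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids)))

def mask_tokens(ids, mask_prob=MASK_PROB):
    """Hide a random subset of tokens; the loss is computed only at those positions."""
    labels = ids.clone()
    hidden = torch.rand(ids.shape) < mask_prob
    labels[~hidden] = -100                  # ignored by the cross-entropy below
    corrupted = ids.clone()
    corrupted[hidden] = MASK_ID
    return corrupted, labels

model = TinyMLM()
ids = torch.randint(2, VOCAB, (8, 32))      # fake batch of token ids
inputs, labels = mask_tokens(ids)
logits = model(inputs)
loss = nn.functional.cross_entropy(logits.view(-1, VOCAB), labels.view(-1), ignore_index=-100)
loss.backward()
```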
Context autoencoder for self-supervised representation learning
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE),
for self-supervised representation pretraining. We pretrain an encoder by making predictions …
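As a rough picture of masked image modeling in general: hide a subset of image patches and train a network to predict their content from the visible ones. The sketch below predicts raw pixels of the masked patches for simplicity; CAE itself makes its predictions in the encoded representation space, so this is only the generic recipe, not the paper's architecture:

```python
# Generic masked-image-modeling sketch (illustrative; CAE predicts in latent
# space rather than the raw pixels predicted here). All sizes are toy choices.
import torch
import torch.nn as nn

PATCH, DIM = 16, 128
N_PATCH = (224 // PATCH) ** 2
MASK_RATIO = 0.5

patchify = nn.Conv2d(3, DIM, kernel_size=PATCH, stride=PATCH)   # image -> patch embeddings
layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
predictor = nn.Linear(DIM, 3 * PATCH * PATCH)                   # pixel prediction per patch

imgs = torch.randn(4, 3, 224, 224)
tokens = patchify(imgs).flatten(2).transpose(1, 2)               # (B, N_PATCH, DIM)

# Zero out the embeddings of randomly chosen patches so their content is hidden.
mask = torch.rand(4, N_PATCH) < MASK_RATIO
encoded = encoder(tokens.masked_fill(mask.unsqueeze(-1), 0.0))

# Build per-patch pixel targets and score the predictions at masked positions only.
targets = imgs.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)   # (B, 3, 14, 14, 16, 16)
targets = targets.permute(0, 2, 3, 1, 4, 5).reshape(4, N_PATCH, -1)
loss = ((predictor(encoded) - targets) ** 2)[mask].mean()
loss.backward()
```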
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Unified contrastive learning in image-text-label space
Visual recognition has recently been learned via either supervised learning on human-annotated
image-label data or language-image contrastive learning with web-crawled image-text …
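The unification referred to here can be read as a contrastive objective in which any image-text pair sharing a label counts as a positive, so purely paired web data (each sample its own label) and classification data (captions as class names) fit the same loss. Below is a simplified, label-aware bidirectional contrastive loss along those lines; the temperature and normalization details are illustrative choices rather than the paper's exact formulation:

```python
# Simplified label-aware bidirectional contrastive loss (illustrative sketch).
import torch
import torch.nn.functional as F

def unified_contrastive_loss(img_feat, txt_feat, labels, temperature=0.07):
    """Pairs (i, j) with labels[i] == labels[j] are treated as positives."""
    img_feat = F.normalize(img_feat, dim=-1)
    txt_feat = F.normalize(txt_feat, dim=-1)
    logits = img_feat @ txt_feat.t() / temperature           # (B, B) cosine similarities
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    pos = pos / pos.sum(dim=1, keepdim=True)                 # spread weight over multiple positives
    loss_i2t = -(pos * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss_t2i = -(pos * F.log_softmax(logits.t(), dim=1)).sum(dim=1).mean()
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage: 8 samples, 64-dim features, 3 underlying labels.
img = torch.randn(8, 64, requires_grad=True)
txt = torch.randn(8, 64, requires_grad=True)
labels = torch.randint(0, 3, (8,))
unified_contrastive_loss(img, txt, labels).backward()
```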
Self-supervised learning for recommender systems: A survey
In recent years, neural architecture-based recommender systems have achieved
tremendous success, but they still fall short of expectations when dealing with highly sparse …
A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …