A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

StableRep: Synthetic images from text-to-image models make strong visual representation learners

Y Tian, L Fan, P Isola, H Chang… - Advances in Neural …, 2024 - proceedings.neurips.cc
We investigate the potential of learning visual representations using synthetic images
generated by text-to-image models. This is a natural question in the light of the excellent …

A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations

Z Zhao, L Alzubaidi, J Zhang, Y Duan, Y Gu - Expert Systems with …, 2024 - Elsevier
Deep learning has emerged as a powerful tool in various domains, revolutionising machine
learning research. However, one persistent challenge is the scarcity of labelled training …

iBOT: Image BERT pre-training with online tokenizer

J Zhou, C Wei, H Wang, W Shen, C Xie, A Yuille… - arXiv preprint arXiv …, 2021 - arxiv.org
The success of language Transformers is primarily attributed to the pretext task of masked
language modeling (MLM), where texts are first tokenized into semantically meaningful …
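
As a rough illustration of the masked-modeling pretext task this snippet refers to, the toy PyTorch sketch below masks a subset of token ids and trains a small Transformer encoder to predict the originals at the masked positions. All sizes, the reserved mask id, and the model are placeholder assumptions, not iBOT's actual setup.

import torch
import torch.nn as nn

vocab_size, seq_len, dim = 1000, 16, 64                   # assumed toy sizes
mask_id = 0                                               # id reserved for the [MASK] token (assumption)

tokens = torch.randint(1, vocab_size, (2, seq_len))       # a small batch of token ids
mask = torch.zeros_like(tokens, dtype=torch.bool)
mask[:, ::4] = True                                       # hide every 4th position (real MLM masks ~15% at random)
corrupted = tokens.masked_fill(mask, mask_id)

embed = nn.Embedding(vocab_size, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
head = nn.Linear(dim, vocab_size)

logits = head(encoder(embed(corrupted)))                  # predict the original id at every position
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])   # score only the masked positions
loss.backward()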

Context autoencoder for self-supervised representation learning

X Chen, M Ding, X Wang, Y Xin, S Mo, Y Wang… - International Journal of …, 2024 - Springer
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE),
for self-supervised representation pretraining. We pretrain an encoder by making predictions …
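
To make the masked image modeling idea in this snippet concrete, here is a generic MIM sketch (not the CAE architecture itself): image patches are embedded, some are replaced by a learned mask token, and the encoder is trained to reconstruct the pixels of the hidden patches. All shapes and modules are illustrative assumptions.

import torch
import torch.nn as nn

patch, dim = 8, 64
img = torch.rand(1, 3, 32, 32)                                           # toy image
p = img.unfold(2, patch, patch).unfold(3, patch, patch)                   # cut into 8x8 patches
patches = p.permute(0, 2, 3, 1, 4, 5).reshape(1, -1, 3 * patch * patch)   # (1, 16, 192)
num = patches.shape[1]

mask = torch.zeros(num, dtype=torch.bool)
mask[::2] = True                                                          # hide half the patches

proj = nn.Linear(3 * patch * patch, dim)
mask_token = nn.Parameter(torch.zeros(dim))
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
decoder = nn.Linear(dim, 3 * patch * patch)

tokens = torch.where(mask.view(1, -1, 1), mask_token.view(1, 1, -1), proj(patches))
pred = decoder(encoder(tokens))                                           # pixel predictions per patch
loss = nn.functional.mse_loss(pred[:, mask], patches[:, mask])            # reconstruct only hidden patches
loss.backward()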

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Unified contrastive learning in image-text-label space

J Yang, C Li, P Zhang, B Xiao, C Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Visual recognition has recently been learned via either supervised learning on human-annotated
image-label data or language-image contrastive learning with web-crawled image-text …
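
For reference, a minimal sketch of the second paradigm mentioned here, language-image contrastive learning with a symmetric InfoNCE-style loss. The random features stand in for image and text encoders, and the temperature is an arbitrary assumption; this is not the paper's unified image-text-label formulation.

import torch
import torch.nn.functional as F

batch, dim, tau = 4, 32, 0.07                               # assumed toy batch size, feature dim, temperature
img_feat = F.normalize(torch.randn(batch, dim), dim=-1)     # stand-in for image-encoder outputs
txt_feat = F.normalize(torch.randn(batch, dim), dim=-1)     # stand-in for text-encoder outputs

logits = img_feat @ txt_feat.t() / tau                      # cosine similarity of every image-text pair
targets = torch.arange(batch)                               # matched pairs sit on the diagonal
loss = (F.cross_entropy(logits, targets) +                  # image-to-text direction
        F.cross_entropy(logits.t(), targets)) / 2           # text-to-image direction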

Self-supervised learning for recommender systems: A survey

J Yu, H Yin, X Xia, T Chen, J Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, neural architecture-based recommender systems have achieved
tremendous success, but they still fall short of expectations when dealing with highly sparse …

A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …