Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

The forward-forward algorithm: Some preliminary investigations

G Hinton - arXiv preprint arXiv:2212.13345, 2022 - arxiv.org
The aim of this paper is to introduce a new learning procedure for neural networks and to
demonstrate that it works well enough on a few small problems to be worth further …
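
The procedure trains each layer locally: weights are adjusted so that a per-layer "goodness" score (here, the sum of squared activations) is high on positive data and low on negative data, with no backward pass through the whole network. A minimal sketch in PyTorch, assuming a logistic loss against a fixed threshold and illustrative layer sizes rather than the paper's exact recipe:

    import torch
    import torch.nn.functional as F

    class FFLayer(torch.nn.Module):
        """One forward-forward layer, trained locally with no global backprop."""
        def __init__(self, d_in, d_out, threshold=2.0, lr=1e-3):
            super().__init__()
            self.linear = torch.nn.Linear(d_in, d_out)
            self.threshold = threshold
            self.opt = torch.optim.Adam(self.parameters(), lr=lr)

        def forward(self, x):
            # Length-normalize the input so the previous layer's goodness cannot leak through.
            return torch.relu(self.linear(F.normalize(x, dim=1)))

        def train_step(self, x_pos, x_neg):
            g_pos = self.forward(x_pos).pow(2).sum(dim=1)  # goodness on positive data
            g_neg = self.forward(x_neg).pow(2).sum(dim=1)  # goodness on negative data
            # Logistic loss: push g_pos above the threshold and g_neg below it.
            loss = (F.softplus(self.threshold - g_pos) + F.softplus(g_neg - self.threshold)).mean()
            self.opt.zero_grad(); loss.backward(); self.opt.step()
            # Detach outputs so the next layer trains independently.
            return self.forward(x_pos).detach(), self.forward(x_neg).detach()

    layer = FFLayer(784, 256)
    pos, neg = layer.train_step(torch.randn(32, 784), torch.randn(32, 784))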

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal
foundation models that demonstrate vision and vision-language capabilities, focusing on the …

Rethinking semantic segmentation: A prototype view

T Zhou, W Wang, E Konukoglu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Prevalent semantic segmentation solutions, despite their different network designs (FCN
based or attention based) and mask decoding strategies (parametric softmax based or pixel …
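
The prototype view replaces learned per-class softmax weights with nonparametric class prototypes in embedding space: each pixel takes the label of its most similar prototype. A minimal sketch of that decoding step, assuming one prototype per class and cosine similarity (the paper uses multiple prototypes per class and additional training machinery):

    import torch
    import torch.nn.functional as F

    def prototype_segment(features, prototypes):
        # features:   (B, D, H, W) dense pixel embeddings from any backbone
        # prototypes: (C, D) one embedding prototype per class
        feats = F.normalize(features, dim=1)      # unit-norm pixel embeddings
        protos = F.normalize(prototypes, dim=1)   # unit-norm class prototypes
        # Cosine similarity of every pixel to every prototype: (B, C, H, W)
        sim = torch.einsum('bdhw,cd->bchw', feats, protos)
        return sim.argmax(dim=1)                  # per-pixel class index

    feats = torch.randn(2, 64, 32, 32)
    protos = torch.randn(21, 64)                  # e.g. 21 PASCAL VOC classes
    masks = prototype_segment(feats, protos)      # (2, 32, 32) label map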

Text and code embeddings by contrastive pre-training

A Neelakantan, T Xu, R Puri, A Radford, JM Han… - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …
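
The underlying objective is contrastive: paired inputs should embed close together while every other example in the batch acts as a negative. A minimal sketch of that in-batch InfoNCE loss in PyTorch; the batch size, embedding dimension, and temperature here are illustrative assumptions, not the paper's settings:

    import torch
    import torch.nn.functional as F

    def info_nce_loss(emb_a, emb_b, temperature=0.05):
        a = F.normalize(emb_a, dim=1)             # (N, D) anchor embeddings
        b = F.normalize(emb_b, dim=1)             # (N, D) their paired positives
        logits = a @ b.t() / temperature          # (N, N) similarity matrix
        targets = torch.arange(a.size(0))         # the diagonal holds the positives
        # Symmetric cross-entropy over rows and columns of the similarity matrix.
        return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

    loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))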

An empirical study of training end-to-end vision-and-language transformers

ZY Dou, Y Xu, Z Gan, J Wang, S Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Vision-and-language (VL) pre-training has proven to be highly effective on various
VL downstream tasks. While recent work has shown that fully transformer-based VL models …
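
In fully transformer-based VL models, the two modalities are typically fused with cross-attention, e.g. a co-attention design in which text tokens attend over image patches and vice versa. A minimal sketch using PyTorch's built-in multi-head attention; the dimensions and the single fusion layer are assumptions for illustration, not the configurations compared in the paper:

    import torch

    d = 64
    txt_to_img = torch.nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
    img_to_txt = torch.nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

    text = torch.randn(2, 12, d)    # (batch, text tokens, dim)
    image = torch.randn(2, 49, d)   # (batch, image patches, dim)

    # Co-attention: each stream queries the other modality's tokens.
    fused_text, _ = txt_to_img(query=text, key=image, value=image)
    fused_image, _ = img_to_txt(query=image, key=text, value=text)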

Self-supervised learning for recommender systems: A survey

J Yu, H Yin, X Xia, T Chen, J Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, neural architecture-based recommender systems have achieved
tremendous success, but they still fall short of expectations when dealing with highly sparse …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides in
recent years. Graph neural networks, also known as deep learning on graphs, graph …
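
"Deep learning on graphs" boils down to message passing: each node aggregates its neighbors' features and combines them with its own through learned transformations. A minimal sketch of one such layer, assuming a dense adjacency matrix and mean aggregation for simplicity:

    import torch

    class MeanAggGNNLayer(torch.nn.Module):
        """One message-passing layer with mean aggregation over neighbors."""
        def __init__(self, d_in, d_out):
            super().__init__()
            self.w_self = torch.nn.Linear(d_in, d_out)
            self.w_neigh = torch.nn.Linear(d_in, d_out)

        def forward(self, x, adj):
            # x: (N, d_in) node features; adj: (N, N) 0/1 adjacency matrix
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # avoid divide-by-zero
            neigh = adj @ x / deg                            # mean over neighbors
            return torch.relu(self.w_self(x) + self.w_neigh(neigh))

    x = torch.randn(5, 16)
    adj = (torch.rand(5, 5) > 0.5).float()
    h = MeanAggGNNLayer(16, 32)(x, adj)                      # (5, 32) updated features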

Emerging properties in self-supervised vision transformers

M Caron, H Touvron, I Misra, H Jégou… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we question if self-supervised learning provides new properties to Vision
Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the …
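
The self-supervised setup behind this study is self-distillation: a student network is trained to match the output distribution of a momentum (EMA) teacher on a different augmented view of the same image. A minimal sketch with a toy MLP standing in for a ViT; the temperatures and momentum are illustrative, and the paper's output centering and multi-crop augmentation are omitted:

    import copy
    import torch
    import torch.nn.functional as F

    student = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32))
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)                  # teacher is never trained by gradients
    opt = torch.optim.SGD(student.parameters(), lr=0.1)

    def dino_step(view_a, view_b, t_student=0.1, t_teacher=0.04, momentum=0.996):
        with torch.no_grad():
            # Sharpened teacher distribution; the paper also centers these
            # outputs to avoid collapse (omitted in this sketch).
            target = F.softmax(teacher(view_a) / t_teacher, dim=1)
        log_pred = F.log_softmax(student(view_b) / t_student, dim=1)
        loss = -(target * log_pred).sum(dim=1).mean()   # cross-entropy between views
        opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():                           # EMA update of the teacher
            for tp, sp in zip(teacher.parameters(), student.parameters()):
                tp.mul_(momentum).add_(sp, alpha=1 - momentum)
        return loss.item()

    loss = dino_step(torch.randn(16, 128), torch.randn(16, 128))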