What can transformers learn in-context? A case study of simple function classes

S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …

On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

Deep learning through the lens of example difficulty

R Baldock, H Maennel… - Advances in Neural …, 2021 - proceedings.neurips.cc
Existing work on understanding deep learning often employs measures that compress all
data-dependent information into a few numbers. In this work, we adopt a perspective based …

No train no gain: Revisiting efficient training algorithms for transformer-based language models

J Kaddour, O Key, P Nawrot… - Advances in Neural …, 2024 - proceedings.neurips.cc
The computation necessary for training Transformer-based language models has
skyrocketed in recent years. This trend has motivated research on efficient training …

ACPL: Anti-curriculum pseudo-labelling for semi-supervised medical image classification

F Liu, Y Tian, Y Chen, Y Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Effective semi-supervised learning (SSL) in medical image analysis (MIA) must address two
challenges: 1) work effectively on both multi-class (e.g., lesion classification) and multi-label …

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

Directed graph contrastive learning

Z Tong, Y Liang, H Ding, Y Dai… - Advances in neural …, 2021 - proceedings.neurips.cc
Graph Contrastive Learning (GCL) has emerged to learn generalizable
representations from contrastive views. However, it is still in its infancy with two concerns: 1) …

Variational annealing on graphs for combinatorial optimization

S Sanokowski, W Berghammer… - Advances in …, 2023 - proceedings.neurips.cc
Several recent unsupervised learning methods use probabilistic approaches to solve
combinatorial optimization (CO) problems based on the assumption of statistically …

Self-paced contrastive learning for semi-supervised medical image segmentation with meta-labels

J Peng, P Wang, C Desrosiers… - Advances in Neural …, 2021 - proceedings.neurips.cc
The contrastive pre-training of a recognition model on a large dataset of unlabeled data
often boosts the model's performance on downstream tasks like image classification …

Curriculum reinforcement learning via constrained optimal transport

P Klink, H Yang, C D'Eramo, J Peters… - International …, 2022 - proceedings.mlr.press
Curriculum reinforcement learning (CRL) allows solving complex tasks by generating a
tailored sequence of learning tasks, starting from easy ones and subsequently increasing …