Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on the fly. In this work, we …
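
A minimal sketch of the prompting setup described above, assuming a plain-text serialization of the (input, output) demonstrations; the formatting, the helper name build_icl_prompt, and the toy task are illustrative, not from the paper:

def build_icl_prompt(examples, query):
    """Serialize (input, output) demonstration pairs plus a query into one prompt."""
    lines = [f"Input: {x} -> Output: {y}" for x, y in examples]
    lines.append(f"Input: {query} -> Output:")   # the model completes this line
    return "\n".join(lines)

demos = [(1, 2), (3, 6), (5, 10)]   # the model should infer y = 2x on the fly
print(build_icl_prompt(demos, 7))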

What makes multi-modal learning better than single (provably)

Y Huang, C Du, Z Xue, X Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …

Spectral methods for data science: A statistical perspective

Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …
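
A minimal sketch of the spectral recipe, assuming the standard rank-one signal-plus-noise model; the dimension, noise level, and estimator here are illustrative, not taken from the monograph:

import numpy as np

rng = np.random.default_rng(0)
d, sigma = 200, 0.02
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                                      # unit planted signal
M = np.outer(u, u) + sigma * rng.standard_normal((d, d))    # noisy observation

# The leading eigenvector of the symmetrized matrix estimates u up to sign.
eigvals, eigvecs = np.linalg.eigh((M + M.T) / 2)
u_hat = eigvecs[:, -1]
print(abs(u_hat @ u))                                       # close to 1 when the noise is small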

Few-shot learning via learning the representation, provably

SS Du, W Hu, SM Kakade, JD Lee, Q Lei - arXiv preprint arXiv:2002.09434, 2020 - arxiv.org
This paper studies few-shot learning via representation learning, where one uses $T$
source tasks with $n_1$ data per task to learn a representation in order to reduce the …
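
A minimal sketch of the two-stage recipe the snippet refers to, under a simplifying noiseless linear-model assumption: estimate a shared $k$-dimensional subspace from the $T$ source tasks, then fit only a $k$-dimensional head on the target task; all sizes are illustrative:

import numpy as np

rng = np.random.default_rng(0)
d, k, T, n1, n2 = 50, 5, 20, 100, 10
B_true = np.linalg.qr(rng.standard_normal((d, k)))[0]       # shared subspace
W = rng.standard_normal((k, T))                             # per-task heads

# Stage 1: solve each source task, then take the top-k subspace of the solutions.
X = rng.standard_normal((T * n1, d))
y = np.concatenate([X[t*n1:(t+1)*n1] @ (B_true @ W[:, t]) for t in range(T)])
thetas = np.stack(
    [np.linalg.lstsq(X[t*n1:(t+1)*n1], y[t*n1:(t+1)*n1], rcond=None)[0]
     for t in range(T)], axis=1)
B_hat = np.linalg.svd(thetas, full_matrices=False)[0][:, :k]

# Stage 2: the target task only needs a k-dimensional head from n2 < d samples.
w_tgt = rng.standard_normal(k)
X2 = rng.standard_normal((n2, d))
y2 = X2 @ (B_true @ w_tgt)
w_hat = np.linalg.lstsq(X2 @ B_hat, y2, rcond=None)[0]
print(np.linalg.norm(X2 @ B_hat @ w_hat - y2))              # ~0: the representation transfers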

Revisiting scalarization in multi-task learning: A theoretical perspective

Y Hu, R Xian, Q Wu, Q Fan, L Yin… - Advances in Neural …, 2024 - proceedings.neurips.cc
Linear scalarization, i.e., combining all loss functions by a weighted sum, has been the
default choice in the literature of multi-task learning (MTL) since its inception. In recent years …
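
In symbols, scalarization trains a single model on the weighted objective $L(\theta) = \sum_t w_t L_t(\theta)$. A minimal sketch with two toy quadratic tasks; the losses, the weights, and the step size are illustrative assumptions:

import numpy as np

w = np.array([0.7, 0.3])                 # fixed scalarization weights
theta = np.zeros(3)
for _ in range(200):                     # gradient descent on the weighted sum
    # two toy task losses: ||theta - 1||^2 and ||theta + 1||^2
    grad = w[0] * 2 * (theta - 1.0) + w[1] * 2 * (theta + 1.0)
    theta -= 0.05 * grad
print(theta)                             # converges to 0.7*1 + 0.3*(-1) = 0.4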

C-mixup: Improving generalization in regression

H Yao, Y Wang, L Zhang, JY Zou… - Advances in neural …, 2022 - proceedings.neurips.cc
Improving the generalization of deep networks is an important open challenge, particularly
in domains without plentiful data. The mixup algorithm improves generalization by linearly …
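
A minimal sketch of the interpolation step, including C-Mixup's regression-specific refinement of sampling mixing partners with nearby labels via a Gaussian kernel over label distances; the toy data, bandwidth, and Beta parameter are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))          # toy regression inputs
y = rng.standard_normal(8)               # toy continuous labels

def c_mixup_pair(i, alpha=0.2, bandwidth=1.0):
    # sample a partner j with probability decaying in |y_j - y_i| (C-Mixup style)
    p = np.exp(-((y - y[i]) ** 2) / (2 * bandwidth ** 2))
    p[i] = 0.0
    j = rng.choice(len(y), p=p / p.sum())
    lam = rng.beta(alpha, alpha)         # mixup's interpolation coefficient
    return lam * X[i] + (1 - lam) * X[j], lam * y[i] + (1 - lam) * y[j]

x_mix, y_mix = c_mixup_pair(0)           # one mixed training example
print(x_mix, y_mix)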

On the theory of transfer learning: The importance of task diversity

N Tripuraneni, M Jordan, C Jin - Advances in neural …, 2020 - proceedings.neurips.cc
We provide new statistical guarantees for transfer learning via representation learning--
when transfer is achieved by learning a feature representation shared across different tasks …

FedAvg with fine tuning: Local updates lead to representation learning

L Collins, H Hassani, A Mokhtari… - Advances in Neural …, 2022 - proceedings.neurips.cc
The Federated Averaging (FedAvg) algorithm, which consists of alternating
between a few local stochastic gradient updates at client nodes, followed by a model …
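
A minimal sketch of the alternation the snippet describes: each round, clients run a few local gradient steps from the current global model and the server averages the results. The quadratic client objectives, counts, and step size are illustrative assumptions, not the paper's setting:

import numpy as np

rng = np.random.default_rng(0)
n_clients, d, local_steps, lr = 5, 4, 3, 0.1
targets = rng.standard_normal((n_clients, d))      # minimizer of each client's loss

global_model = np.zeros(d)
for _ in range(50):                                # communication rounds
    client_models = []
    for c in range(n_clients):
        theta = global_model.copy()
        for _ in range(local_steps):               # local steps on ||theta - target_c||^2
            theta -= lr * 2 * (theta - targets[c])
        client_models.append(theta)
    global_model = np.mean(client_models, axis=0)  # server-side averaging
print(global_model - targets.mean(axis=0))         # ~0 for these quadratic losses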

Meta-learning approaches for learning-to-learn in deep learning: A survey

Y Tian, X Zhao, W Huang - Neurocomputing, 2022 - Elsevier
Compared to traditional machine learning, deep learning can learn deeper, more abstract
data representations and capture the properties of scattered data. It has gained considerable …

Estimation and inference for high-dimensional generalized linear models with knowledge transfer

S Li, L Zhang, TT Cai, H Li - Journal of the American Statistical …, 2024 - Taylor & Francis
Transfer learning provides a powerful tool for incorporating data from related studies into a
target study of interest. In epidemiology and medical studies, the classification of a target …