Transformers as algorithms: Generalization and stability in in-context learning
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on the fly. In this work, we …
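The prompt format this snippet describes is easy to make concrete. Below is a minimal Python sketch of assembling an in-context prompt from (input, output) demonstrations; the function name and the "Input:/Output:" template are illustrative choices, not the paper's exact format.

    # A minimal sketch of in-context prompt assembly; the template below is
    # an illustrative convention, not the format used in the paper.
    def build_icl_prompt(examples, query):
        """Concatenate (input, output) demonstrations followed by the query."""
        lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
        lines.append(f"Input: {query}\nOutput:")  # the model completes this line
        return "\n\n".join(lines)

    prompt = build_icl_prompt([("2 + 2", "4"), ("3 + 5", "8")], "7 + 1")
    print(prompt)  # a frozen transformer then predicts the answer in-context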
What makes multi-modal learning better than single (provably)
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …
Spectral methods for data science: A statistical perspective
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …
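A toy instance of the spectral recipe this abstract alludes to: estimate a planted direction from a noisy rank-one matrix via its top eigenvector. The dimensions, signal strength, and noise level below are arbitrary choices for illustration.

    # Spectral estimation sketch: leading eigenvector of signal + symmetric noise.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)                      # planted unit-norm signal
    noise = rng.standard_normal((n, n))
    M = 5.0 * np.outer(u, u) + (noise + noise.T) / np.sqrt(2 * n)  # rank-one signal + Wigner noise

    eigvals, eigvecs = np.linalg.eigh(M)
    u_hat = eigvecs[:, -1]                      # leading eigenvector = spectral estimate
    print(abs(u_hat @ u))                       # correlation with the truth, close to 1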
Few-shot learning via learning the representation, provably
This paper studies few-shot learning via representation learning, where one uses $T$
source tasks with $n_1$ data per task to learn a representation in order to reduce the …
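The flavor of guarantee this line of work targets can be written compactly. The display below is a generic form, with the symbols $d$ (ambient dimension), $k$ (representation dimension), and $n_2$ (target-task sample size) supplied here for illustration, and constants and logarithmic factors omitted; it is the shape of such bounds, not the paper's exact statement:

    \[
    \text{excess risk on target} \;\lesssim\; \frac{dk}{n_1 T} + \frac{k}{n_2},
    \]

so the cost of learning the shared $k$-dimensional representation is amortized over all $n_1 T$ source samples, and only a $k$-dimensional predictor must be fit from the $n_2$ target samples.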
Revisiting scalarization in multi-task learning: A theoretical perspective
Linear scalarization, i.e., combining all loss functions by a weighted sum, has been the
default choice in the literature of multi-task learning (MTL) since its inception. In recent years …
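Linear scalarization itself is a one-liner. Here is a minimal PyTorch-style sketch of a scalarized training step; the toy losses, the shared parameter vector, and the fixed weights are placeholders for whatever the actual MTL setup provides.

    # Linear scalarization sketch: combine per-task losses by a weighted sum.
    import torch

    def scalarized_loss(task_losses, weights):
        """L = sum_i w_i * L_i over the per-task losses."""
        return sum(w * L for w, L in zip(weights, task_losses))

    # e.g. two toy tasks weighted 0.7 / 0.3 on a shared parameter vector
    theta = torch.zeros(3, requires_grad=True)
    losses = [(theta - 1).pow(2).sum(), (theta + 1).pow(2).sum()]
    total = scalarized_loss(losses, [0.7, 0.3])
    total.backward()                      # one backward pass through the shared model
    print(theta.grad)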
C-mixup: Improving generalization in regression
Improving the generalization of deep networks is an important open challenge, particularly
in domains without plentiful data. The mixup algorithm improves generalization by linearly …
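The linear interpolation the snippet mentions, plus the label-aware sampling that distinguishes C-mixup, fits in a short sketch. In the version below, the mixing partner is drawn with probability given by a Gaussian kernel on label distance; the bandwidth sigma and Beta parameter alpha are illustrative choices, not the paper's tuned values.

    # Hedged sketch of mixup for regression with label-similarity sampling.
    import numpy as np

    def c_mixup_batch(X, y, alpha=0.2, sigma=1.0, rng=np.random.default_rng(0)):
        n = len(y)
        lam = rng.beta(alpha, alpha, size=n)
        X_mix, y_mix = np.empty_like(X), np.empty_like(y)
        for i in range(n):
            # partner j drawn with probability ~ exp(-(y_i - y_j)^2 / (2 sigma^2))
            w = np.exp(-((y - y[i]) ** 2) / (2 * sigma ** 2))
            w[i] = 0.0
            j = rng.choice(n, p=w / w.sum())
            X_mix[i] = lam[i] * X[i] + (1 - lam[i]) * X[j]   # linear interpolation
            y_mix[i] = lam[i] * y[i] + (1 - lam[i]) * y[j]
        return X_mix, y_mix

    X = np.random.default_rng(1).standard_normal((8, 4))
    y = X @ np.ones(4)
    print(c_mixup_batch(X, y)[1])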
On the theory of transfer learning: The importance of task diversity
We provide new statistical guarantees for transfer learning via representation learning--
when transfer is achieved by learning a feature representation shared across different tasks …
Fedavg with fine tuning: Local updates lead to representation learning
The Federated Averaging (FedAvg) algorithm, which consists of alternating
between a few local stochastic gradient updates at client nodes, followed by a model …
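The alternation the abstract describes, local gradient updates at each client followed by server-side averaging, is captured by the toy sketch below on a least-squares problem. The client data, step size, and number of local steps are placeholder choices.

    # FedAvg round sketch: local SGD at each client, then average the models.
    import numpy as np

    def local_sgd(w, A, b, steps=5, lr=0.1):
        for _ in range(steps):
            w = w - lr * A.T @ (A @ w - b) / len(b)   # gradient of 0.5*||Aw - b||^2 / n
        return w

    rng = np.random.default_rng(0)
    clients = [(rng.standard_normal((20, 3)), rng.standard_normal(20)) for _ in range(4)]
    w_global = np.zeros(3)
    for round_ in range(10):
        local_models = [local_sgd(w_global.copy(), A, b) for A, b in clients]
        w_global = np.mean(local_models, axis=0)      # server-side model averaging
    print(w_global)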
Meta-learning approaches for learning-to-learn in deep learning: A survey
Y Tian, X Zhao, W Huang - Neurocomputing, 2022 - Elsevier
Compared to traditional machine learning, deep learning can learn deeper, more abstract
data representations and capture the properties of scattered data. It has gained considerable …
Estimation and inference for high-dimensional generalized linear models with knowledge transfer
Transfer learning provides a powerful tool for incorporating data from related studies into a
target study of interest. In epidemiology and medical studies, the classification of a target …