Statistical learning theory for control: A finite-sample perspective
Learning algorithms have become an integral component of modern engineering solutions.
Examples range from self-driving cars and recommender systems to finance and even …
Transformers as algorithms: Generalization and stability in in-context learning
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
A tutorial on the non-asymptotic theory of system identification
This tutorial serves as an introduction to recently developed non-asymptotic methods in the
theory of (mainly linear) system identification. We emphasize tools we deem particularly …
Learning from many trajectories
We initiate a study of supervised learning from many independent sequences ("trajectories")
of non-independent covariates, reflecting tasks in sequence modeling, control, and …
Optimistic active exploration of dynamical systems
Reinforcement learning algorithms commonly seek to optimize policies for solving one
particular task. How should we explore an unknown dynamical system such that the …
Sharp rates in dependent learning theory: Avoiding sample size deflation for the square loss
In this work, we study statistical learning with dependent ($\beta$-mixing) data and square
loss in a hypothesis class $\mathscr{F} \subset L_{\Psi_p}$ where $\Psi_p$ is the norm …
PAC-Bayes Generalisation Bounds for Dynamical Systems Including Stable RNNs
In this paper, we derive a PAC-Bayes bound on the generalisation gap, in a supervised time-
series setting for a special class of discrete-time non-linear dynamical systems. This class …
Streaming PCA for Markovian data
Since its inception in 1982, Oja's algorithm has become an established method for
streaming principal component analysis (PCA). We study the problem of streaming PCA …
The noise level in linear regression with dependent data
We derive upper bounds for random design linear regression with dependent ($\beta$-mixing)
data absent any realizability assumptions. In contrast to the strictly realizable …
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Modern language models rely on the transformer architecture and attention mechanism to
perform language understanding and text generation. In this work, we study learning a 1 …