Statistical learning theory for control: A finite-sample perspective

A Tsiamis, I Ziemann, N Matni… - IEEE Control Systems …, 2023 - ieeexplore.ieee.org
Learning algorithms have become an integral component to modern engineering solutions.
Examples range from self-driving cars and recommender systems to finance and even …

Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …

A tutorial on the non-asymptotic theory of system identification

I Ziemann, A Tsiamis, B Lee, Y Jedra… - 2023 62nd IEEE …, 2023 - ieeexplore.ieee.org
This tutorial serves as an introduction to recently developed non-asymptotic methods in the
theory of (mainly linear) system identification. We emphasize tools we deem particularly …

Learning from many trajectories

S Tu, R Frostig, M Soltanolkotabi - Journal of Machine Learning Research, 2024 - jmlr.org
We initiate a study of supervised learning from many independent sequences ("trajectories")
of non-independent covariates, reflecting tasks in sequence modeling, control, and …
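The many-trajectories setting in this entry can be sketched as ordinary least squares pooling transitions across independent rollouts of a linear system (a minimal illustration only, not the paper's estimator; the dynamics matrix and noise scale in the usage below are invented for the example):

```python
import numpy as np

def fit_from_trajectories(trajs):
    """Least-squares estimate of A in x_{t+1} = A x_t + w_t, pooling
    transitions across many independent trajectories.

    Sketch of the many-trajectories setting; each element of `trajs`
    is a (T+1, d) array holding one rollout.
    """
    X = np.vstack([traj[:-1] for traj in trajs])   # covariates x_t
    Y = np.vstack([traj[1:] for traj in trajs])    # targets x_{t+1}
    # Solve min_A ||X A^T - Y||_F^2, i.e. A^T = X^+ Y
    A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
    return A_hat
```

Within each trajectory the covariates are dependent, but independence across trajectories is what drives the sample-size scaling studied in work of this kind.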

Optimistic active exploration of dynamical systems

L Treven, C Sancaktar, S Blaes… - Advances in Neural …, 2023 - proceedings.neurips.cc
Reinforcement learning algorithms commonly seek to optimize policies for solving one
particular task. How should we explore an unknown dynamical system such that the …

Sharp rates in dependent learning theory: Avoiding sample size deflation for the square loss

I Ziemann, S Tu, GJ Pappas, N Matni - arXiv preprint arXiv:2402.05928, 2024 - arxiv.org
In this work, we study statistical learning with dependent ($\beta$-mixing) data and square
loss in a hypothesis class $\mathscr{F} \subset L_{\Psi_p}$ where $\Psi_p$ is the norm …

PAC-Bayes Generalisation Bounds for Dynamical Systems Including Stable RNNs

D Eringis, J Leth, ZH Tan, R Wisniewski… - Proceedings of the …, 2024 - ojs.aaai.org
In this paper, we derive a PAC-Bayes bound on the generalisation gap, in a supervised time-
series setting for a special class of discrete-time non-linear dynamical systems. This class …

Streaming PCA for Markovian data

S Kumar, P Sarkar - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Since its inception in 1982, Oja's algorithm has become an established method for
streaming principal component analysis (PCA). We study the problem of streaming PCA …
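Oja's rule, the algorithm this entry builds on, can be sketched in a few lines (a minimal single-component version with a fixed step size; the step-size schedule and the Markovian-data analysis in the paper may differ):

```python
import numpy as np

def oja_streaming_pca(stream, dim, lr=0.01, seed=0):
    """Estimate the top principal direction from a stream of vectors
    using Oja's rule.

    Sketch only: a fixed learning rate `lr` is an assumption; analyses
    for Markovian data typically use a decaying step-size schedule.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for x in stream:
        w += lr * x * (x @ w)      # Hebbian update toward the top eigenvector
        w /= np.linalg.norm(w)     # renormalize to keep w on the unit sphere
    return w
```

The streaming aspect is that each sample `x` is used once and discarded; the Markovian twist studied in the paper is that consecutive samples in the stream are dependent rather than i.i.d.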

The noise level in linear regression with dependent data

I Ziemann, S Tu, GJ Pappas… - Advances in Neural …, 2024 - proceedings.neurips.cc
We derive upper bounds for random design linear regression with dependent ($\beta$-mixing)
data absent any realizability assumptions. In contrast to the strictly realizable …

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

ME Ildiz, Y Huang, Y Li, AS Rawat, S Oymak - arXiv preprint arXiv …, 2024 - arxiv.org
Modern language models rely on the transformer architecture and attention mechanism to
perform language understanding and text generation. In this work, we study learning a 1 …