A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
High-dimensional asymptotics of feature learning: How one gradient step improves the representation
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a
two-layer neural network: $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma …
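As a rough illustration of the setup this snippet describes, here is a minimal numpy sketch of one full-batch gradient step on the first-layer weights of a two-layer network $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\cdot)$, with the ReLU activation, the placeholder teacher labels, and all dimensions assumed rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, n, eta = 20, 50, 200, 0.5            # input dim, width, samples, step size (all assumed)

X = rng.standard_normal((n, d))            # isotropic Gaussian inputs
y = np.tanh(X @ rng.standard_normal(d))    # placeholder teacher labels, not the paper's setup

W = rng.standard_normal((d, N)) / np.sqrt(d)   # first-layer weights (trained)
a = rng.standard_normal(N)                     # second-layer weights (held fixed)

def f(X, W, a):
    # f(x) = a^T sigma(W^T x) / sqrt(N), with sigma = ReLU chosen for illustration
    return np.maximum(X @ W, 0.0) @ a / np.sqrt(N)

# one full-batch gradient step on W for the squared loss
resid = f(X, W, a) - y                                        # shape (n,)
act = (X @ W > 0).astype(float)                               # ReLU derivative, shape (n, N)
grad_W = X.T @ (resid[:, None] * act * a[None, :]) / (np.sqrt(N) * n)
W_plus = W - eta * grad_W                                     # representation after one step
```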
Learning in the presence of low-dimensional structure: a spiked random matrix perspective
J Ba, MA Erdogdu, T Suzuki… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the learning of a single-index target function $f_*:\mathbb{R}^d\to\mathbb{R}$
under spiked covariance data: $$f_*(\boldsymbol{x})=\textstyle\sigma_*(\frac{1}{\sqrt …
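A minimal sketch of the kind of data model named in this snippet: inputs drawn from a spiked covariance $\boldsymbol{\Sigma}=\boldsymbol{I}_d+\kappa\,\boldsymbol{\theta}\boldsymbol{\theta}^\top$ and single-index labels. The link function, the normalization inside it, and the choice of aligning the spike with the index direction are all assumptions for illustration, since the snippet is truncated.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, kappa = 100, 500, 5.0                 # dimension, samples, spike strength (assumed)

theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)              # unit vector: spike and index direction coincide here

# x ~ N(0, I_d + kappa * theta theta^T): isotropic noise plus a rank-one spike
X = rng.standard_normal((n, d)) + np.sqrt(kappa) * rng.standard_normal((n, 1)) * theta

# single-index labels y = sigma_*(<x, theta>); the link and its normalization are placeholders
sigma_star = np.tanh
y = sigma_star(X @ theta)
```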
Towards understanding grokking: An effective theory of representation learning
We aim to understand grokking, a phenomenon where models generalize long after
overfitting their training set. We present both a microscopic analysis anchored by an effective …
Surprises in high-dimensional ridgeless least squares interpolation
Interpolators—estimators that achieve zero training error—have attracted growing attention
in machine learning, mainly because state-of-the-art neural networks appear to be models of …
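A small numpy sketch of the object this snippet is about: the ridgeless (minimum-$\ell_2$-norm) least squares interpolator in an overparameterized linear model, computed via the pseudoinverse. The data model, noise level, and dimensions are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 50, 200                                    # overparameterized regime: d > n (assumed)

X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta + 0.1 * rng.standard_normal(n)

# ridgeless least squares = minimum-l2-norm interpolator (the lambda -> 0+ ridge limit)
beta_hat = np.linalg.pinv(X) @ y
print("train error:", np.mean((X @ beta_hat - y) ** 2))       # ~0: it interpolates

X_test = rng.standard_normal((2000, d))
print("test error:", np.mean((X_test @ beta_hat - X_test @ beta) ** 2))
```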
Universality of empirical risk minimization
A Montanari, BN Saeed - Conference on Learning Theory, 2022 - proceedings.mlr.press
Consider supervised learning from iid samples $\{(y_i, x_i)\}_{i\le n}$ where $x_i\in\mathbb{R}^p$ are
feature vectors and $y_i\in\mathbb{R}$ are labels. We study empirical risk minimization over a class of …
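For concreteness, here is a minimal instance of empirical risk minimization in this iid-sample setting: logistic loss with a ridge penalty over linear predictors, minimized by plain gradient descent. The label rule, loss, and regularization are assumptions chosen for illustration, not the class studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 400, 30

X = rng.standard_normal((n, p))                          # iid feature vectors x_i in R^p
w_star = rng.standard_normal(p) / np.sqrt(p)
y = np.sign(X @ w_star + 0.1 * rng.standard_normal(n))   # labels y_i from a noisy linear rule (assumed)

def empirical_risk(w, lam=1e-2):
    # logistic loss plus a ridge penalty, over the class of linear predictors
    return np.mean(np.log1p(np.exp(-y * (X @ w)))) + lam * np.sum(w ** 2) / 2

w, lr, lam = np.zeros(p), 0.5, 1e-2
for _ in range(500):                                     # plain gradient descent on the empirical risk
    grad = -(X.T @ (y / (1.0 + np.exp(y * (X @ w))))) / n + lam * w
    w -= lr * grad

print("empirical risk at the minimizer:", empirical_risk(w))
```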
Learning curves of generic features maps for realistic datasets with a teacher-student model
Teacher-student models provide a framework in which the typical-case performance of high-
dimensional supervised learning can be described in closed form. The assumptions of …
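A toy instance of the teacher-student framework mentioned here: a fixed teacher generates labels, and a student with a generic feature map (random ReLU features, an assumption for illustration) is fit by ridge regression and evaluated against the teacher. None of the specific choices below are taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, p = 50, 300, 200

# teacher: a fixed linear rule on the raw inputs (one simple choice of teacher map)
w_teacher = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ w_teacher + 0.1 * rng.standard_normal(n)

# student: ridge regression on a generic feature map, here random ReLU features
F = rng.standard_normal((d, p)) / np.sqrt(d)
Phi = np.maximum(X @ F, 0.0)
lam = 1e-2
w_student = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)

X_te = rng.standard_normal((5000, d))
err = np.mean((np.maximum(X_te @ F, 0.0) @ w_student - X_te @ w_teacher) ** 2)
print("student generalization error vs. teacher:", err)
```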
Random features for kernel approximation: A survey on algorithms, theory, and beyond
The class of random features is one of the most popular techniques to speed up kernel
methods in large-scale problems. Related works have been recognized by the NeurIPS Test …
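As a concrete example of the technique surveyed here, the sketch below uses random Fourier features to approximate the RBF kernel; the bandwidth, dimensions, and number of features are assumed values.

```python
import numpy as np

rng = np.random.default_rng(5)
d, D = 10, 2000                              # input dim, number of random features (assumed)

# random Fourier features for the RBF kernel k(x, y) = exp(-||x - y||^2 / 2)
Omega = rng.standard_normal((d, D))          # frequencies drawn from the kernel's spectral density
b = rng.uniform(0.0, 2.0 * np.pi, D)         # random phases

def phi(X):
    return np.sqrt(2.0 / D) * np.cos(X @ Omega + b)

X = rng.standard_normal((100, d))
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
K_approx = phi(X) @ phi(X).T
print("max absolute approximation error:", np.abs(K_exact - K_approx).max())
```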
Universality laws for high-dimensional learning with random features
We prove a universality theorem for learning with random features. Our result shows that, in
terms of training and generalization errors, a random feature model with a nonlinear …
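To illustrate the kind of universality statement this snippet refers to, the sketch below fits ridge regression on nonlinear random features and on a "Gaussian equivalent" design with matched first and second moments, and compares their training errors; the activation, data model, and ridge parameter are all assumptions, and the moment matching is estimated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(6)
d, N, n, lam = 100, 300, 400, 1e-1
sigma = np.tanh                                            # nonlinearity (assumed)

# Gaussian-equivalence coefficients of sigma, estimated by Monte Carlo
g = rng.standard_normal(1_000_000)
mu0, mu1 = sigma(g).mean(), (g * sigma(g)).mean()          # mean and linear component
mu_star = np.sqrt(sigma(g).var() - mu1 ** 2)               # residual nonlinear part

W = rng.standard_normal((d, N)) / np.sqrt(d)
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta + 0.1 * rng.standard_normal(n)

Phi_rf = sigma(X @ W)                                                 # random-feature design
Phi_ge = mu0 + mu1 * (X @ W) + mu_star * rng.standard_normal((n, N))  # Gaussian equivalent design

for name, Phi in [("random features", Phi_rf), ("Gaussian equivalent", Phi_ge)]:
    a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y)     # ridge regression
    print(name, "training error:", np.mean((Phi @ a - y) ** 2))
```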
A statistical mechanics framework for Bayesian deep neural networks beyond the infinite-width limit
Despite the practical success of deep neural networks, a comprehensive theoretical
framework that can predict practically relevant scores, such as the test accuracy, from …