Representations and generalization in artificial and brain neural networks

Q Li, B Sorscher, H Sompolinsky - Proceedings of the National Academy of …, 2024 - pnas.org
Humans and animals excel at generalizing from limited data, a capability yet to be fully
replicated in artificial intelligence. This perspective investigates generalization in biological …

Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

High-dimensional asymptotics of feature learning: How one gradient step improves the representation

J Ba, MA Erdogdu, T Suzuki, Z Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study the first gradient descent step on the first-layer parameters $\boldsymbol {W} $ in a
two-layer neural network: $ f (\boldsymbol {x})=\frac {1}{\sqrt {N}}\boldsymbol {a}^\top\sigma …

Learning in the presence of low-dimensional structure: a spiked random matrix perspective

J Ba, MA Erdogdu, T Suzuki… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the learning of a single-index target function $ f_*:\mathbb {R}^ d\to\mathbb {R}
$ under spiked covariance data: $$ f_*(\boldsymbol {x})=\textstyle\sigma_*(\frac {1}{\sqrt …

[HTML][HTML] Surprises in high-dimensional ridgeless least squares interpolation

T Hastie, A Montanari, S Rosset, RJ Tibshirani - Annals of statistics, 2022 - ncbi.nlm.nih.gov
Interpolators—estimators that achieve zero training error—have attracted growing attention
in machine learning, mainly because state-of-the art neural networks appear to be models of …

Universality of empirical risk minimization

A Montanari, BN Saeed - Conference on Learning Theory, 2022 - proceedings.mlr.press
Consider supervised learning from iid samples {(y_i, x_i)} _ {i≤ n} where x_i∈ R_p are
feature vectors and y_i∈ R are labels. We study empirical risk minimization over a class of …

Learning curves of generic features maps for realistic datasets with a teacher-student model

B Loureiro, C Gerbelot, H Cui, S Goldt… - Advances in …, 2021 - proceedings.neurips.cc
Teacher-student models provide a framework in which the typical-case performance of high-
dimensional supervised learning can be described in closed form. The assumptions of …

Random features for kernel approximation: A survey on algorithms, theory, and beyond

F Liu, X Huang, Y Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The class of random features is one of the most popular techniques to speed up kernel
methods in large-scale problems. Related works have been recognized by the NeurIPS Test …

Generalisation error in learning with random features and the hidden manifold model

F Gerace, B Loureiro, F Krzakala… - International …, 2020 - proceedings.mlr.press
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …

On the Optimal Weighted Regularization in Overparameterized Linear Regression

D Wu, J Xu - Advances in Neural Information Processing …, 2020 - proceedings.neurips.cc
We consider the linear model $\vy=\vX\vbeta_ {\star}+\vepsilon $ with $\vX\in\mathbb
{R}^{n\times p} $ in the overparameterized regime $ p> n $. We estimate $\vbeta_ {\star} …