High-dimensional asymptotics of feature learning: How one gradient step improves the representation
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a
two-layer neural network: $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma$ …
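For concreteness, here is a minimal numpy sketch of the setup described above: one full-batch gradient step on the first-layer weights of a two-layer network with fixed readout. The squared loss, ReLU activation, learning rate, and dimensions are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, n = 50, 100, 200                          # input dim, hidden width, samples

X = rng.standard_normal((n, d)) / np.sqrt(d)    # inputs
y = rng.standard_normal(n)                      # placeholder targets

W = rng.standard_normal((N, d))                 # first-layer weights (trained)
a = rng.standard_normal(N)                      # readout weights (kept fixed here)

sigma = lambda z: np.maximum(z, 0.0)            # ReLU (illustrative choice)
dsigma = lambda z: (z > 0).astype(float)

def f(X, W, a):
    # f(x) = a^T sigma(W x) / sqrt(N)
    return sigma(X @ W.T) @ a / np.sqrt(N)

# one full-batch gradient step on W under squared loss
eta = 1.0
pre = X @ W.T                                   # (n, N) pre-activations
residual = f(X, W, a) - y                       # (n,)
grad_W = (dsigma(pre) * np.outer(residual, a / np.sqrt(N))).T @ X / n
W_after_one_step = W - eta * grad_W
```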
Towards understanding grokking: An effective theory of representation learning
We aim to understand grokking, a phenomenon where models generalize long after
overfitting their training set. We present both a microscopic analysis anchored by an effective …
Learning curves of generic features maps for realistic datasets with a teacher-student model
Teacher-student models provide a framework in which the typical-case performance of high-
dimensional supervised learning can be described in closed form. The assumptions of …
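A minimal sketch of a teacher-student experiment in the spirit of this line of work: a teacher rule generates noisy labels and a student with a generic feature map is fit by ridge regression. The linear teacher, tanh feature map, noise level, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 100, 400                                   # input dimension, training samples

# Teacher: a fixed rule generating noisy labels
w_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ w_star + 0.1 * rng.standard_normal(n)

# Student: ridge regression on a generic (elementwise) feature map
phi = np.tanh
lam = 1e-2
Z = phi(X)
w_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

# Generalisation error measured on fresh teacher data
X_test = rng.standard_normal((1000, d))
mse = np.mean((phi(X_test) @ w_hat - X_test @ w_star) ** 2)
print(f"test MSE: {mse:.3f}")
```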
Universality laws for high-dimensional learning with random features
We prove a universality theorem for learning with random features. Our result shows that, in
terms of training and generalization errors, a random feature model with a nonlinear …
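As a concrete instance of the random feature model referred to above, the following sketch fixes a random projection, applies a nonlinearity, and trains only a ridge readout. The ReLU nonlinearity, target, regularisation, and dimensions are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d, p, n = 50, 300, 500                      # input dim, random features, samples

X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sign(X[:, 0])                        # illustrative target

F = rng.standard_normal((p, d))             # random, untrained feature matrix
Z = np.maximum(X @ F.T, 0.0)                # nonlinear random features sigma(Fx)

lam = 1e-3                                  # ridge regularisation on the readout
theta = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

train_err = np.mean((Z @ theta - y) ** 2)
print(f"training error: {train_err:.4f}")
```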
Generalisation error in learning with random features and the hidden manifold model
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …
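One way to instantiate synthetic data of the hidden-manifold type mentioned above: inputs are a nonlinear image of a low-dimensional latent vector, and labels depend only on that latent. The specific nonlinearities and dimensions below are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, D, n = 200, 10, 500                       # ambient dim, latent dim, samples

C = rng.standard_normal((d, D))              # fixed map from latent to ambient space
Z = rng.standard_normal((n, D))              # latent coordinates on the hidden manifold

X = np.tanh(Z @ C.T / np.sqrt(D))            # inputs lie near a D-dimensional manifold in R^d
w = rng.standard_normal(D)
y = np.sign(Z @ w)                           # labels depend only on the latent vector

print(X.shape, y.shape)                      # (500, 200) (500,)
```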
A statistical mechanics framework for Bayesian deep neural networks beyond the infinite-width limit
Despite the practical success of deep neural networks, a comprehensive theoretical
framework that can predict practically relevant scores, such as the test accuracy, from …
Deterministic equivalent and error universality of deep random features learning
This manuscript considers the problem of learning a random Gaussian network function
using a fully connected network with frozen intermediate layers and trainable readout layer …
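A short sketch of the architecture described above: random intermediate layers that stay frozen, with only a linear readout fit by ridge regression on the last-layer features. Depth, widths, the tanh activation, and the placeholder targets are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d, width, depth, n = 50, 100, 3, 400

X = rng.standard_normal((n, d))
y = rng.standard_normal(n)                           # placeholder targets

# Frozen random intermediate layers
layers = [rng.standard_normal((width, d)) / np.sqrt(d)]
layers += [rng.standard_normal((width, width)) / np.sqrt(width) for _ in range(depth - 1)]

def features(X):
    H = X
    for W in layers:                                 # forward pass through frozen layers
        H = np.tanh(H @ W.T)
    return H

Phi = features(X)
lam = 1e-3
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(width), Phi.T @ y)   # trainable readout
```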
Bayes-optimal learning of deep random networks of extensive-width
We consider the problem of learning a target function corresponding to a deep, extensive-
width, non-linear neural network with random Gaussian weights. We consider the asymptotic …
Neural networks trained with SGD learn distributions of increasing complexity
The uncanny ability of over-parameterised neural networks to generalise well has been
explained using various "simplicity biases". These theories postulate that neural networks …
Precise learning curves and higher-order scalings for dot-product kernel regression
As modern machine learning models continue to advance the computational frontier, it has
become increasingly important to develop precise estimates for expected performance …
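As a concrete example of the model class analysed here, the sketch below runs kernel ridge regression with a dot-product kernel, i.e. a kernel that depends on the inputs only through their inner product. The polynomial kernel, regularisation, and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 100, 300

X = rng.standard_normal((n, d)) / np.sqrt(d)     # roughly unit-norm inputs
y = X @ rng.standard_normal(d)

def dot_product_kernel(A, B, q=3):
    """Kernel depending on the inputs only through their inner product."""
    return (1.0 + A @ B.T) ** q

lam = 1e-3
K = dot_product_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), y)  # kernel ridge regression coefficients

X_test = rng.standard_normal((50, d)) / np.sqrt(d)
y_pred = dot_product_kernel(X_test, X) @ alpha
```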