High-dimensional asymptotics of feature learning: How one gradient step improves the representation
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a
two-layer neural network: $f(\boldsymbol{x}) = \frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma …
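The model in the snippet above, $f(\boldsymbol{x}) = \frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\cdots)$, and a single gradient step on the first-layer weights can be sketched as follows. This is a minimal illustration, not the paper's setup: the ReLU activation, the squared loss, the dimensions, and the step size are all assumptions filled in for the truncated formula.

```python
import numpy as np

# Illustrative sketch of f(x) = a^T sigma(W x) / sqrt(N) and one gradient
# step on W. Activation (ReLU), loss (squared error), and sizes are assumed.
rng = np.random.default_rng(0)
d, N = 8, 16                                  # input dimension, hidden width
W = rng.standard_normal((N, d)) / np.sqrt(d)  # first-layer weights
a = rng.standard_normal(N)                    # second-layer weights (frozen)

def sigma(z):
    return np.maximum(z, 0.0)  # ReLU (assumed; the paper's sigma is generic)

def f(x, W, a):
    return a @ sigma(W @ x) / np.sqrt(N)

# One gradient step on W for a single sample under loss 0.5 * (f(x) - y)^2.
x = rng.standard_normal(d)
y = 1.0
pre = W @ x                     # pre-activations
err = f(x, W, a) - y
loss_before = 0.5 * err**2
# dL/dW = err * (a * sigma'(pre)) x^T / sqrt(N), with sigma'(z) = 1{z > 0}
grad_W = err * ((a * (pre > 0)).reshape(-1, 1) * x.reshape(1, -1)) / np.sqrt(N)
eta = 0.01                      # small step size (assumed)
W -= eta * grad_W
```

With a small step size the loss on this sample decreases, which is the single-step improvement of the representation that the paper analyzes in the high-dimensional limit.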
Learning single-index models with shallow neural networks
Single-index models are a class of functions given by an unknown univariate "link" function
applied to an unknown one-dimensional projection of the input. These models are …
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
G Ben Arous, R Gheissari… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in
the high-dimensional regime. We prove limit theorems for the trajectories of summary …
Provable guarantees for neural networks via gradient feature learning
Neural networks have achieved remarkable empirical performance, while the current
theoretical analysis is not adequate for understanding their success, e.g., the Neural Tangent …
Dynamics of finite width kernel and prediction fluctuations in mean field neural networks
B Bordelon, C Pehlevan - Advances in Neural Information …, 2024 - proceedings.neurips.cc
We analyze the dynamics of finite-width effects in wide but finite feature-learning neural
networks. Starting from a dynamical mean field theory description of infinite width deep …
Neural networks efficiently learn low-dimensional representations with SGD
We study the problem of training a two-layer neural network (NN) of arbitrary width using
stochastic gradient descent (SGD) where the input $\boldsymbol{x} \in \mathbb{R}^d$ is …
Data-driven emergence of convolutional structure in neural networks
A Ingrosso, S Goldt - … of the National Academy of Sciences, 2022 - National Acad Sciences
Exploiting data invariances is crucial for efficient learning in both artificial and biological
neural circuits. Understanding how neural networks can discover appropriate …
From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a
two-layer neural network trained on Gaussian data and labels generated by a similar …
On the different regimes of stochastic gradient descent
A Sclocchi, M Wyart - … of the National Academy of Sciences, 2024 - National Acad Sciences
Modern deep networks are trained with stochastic gradient descent (SGD) whose key
hyperparameters are the number of data considered at each step or batch size B, and the …
Rigorous dynamical mean-field theory for stochastic gradient descent methods
We prove closed-form equations for the exact high-dimensional asymptotics of a family of
first-order gradient-based methods, learning an estimator (e.g., M-estimator, shallow neural …