Statistical mechanics of deep learning
The recent striking success of deep neural networks in machine learning raises profound
questions about the theoretical principles underlying their success. For example, what can …
Overview frequency principle/spectral bias in deep learning
Understanding deep learning has become increasingly important as it penetrates more and more of
industry and science. In recent years, a line of research rooted in Fourier analysis has shed light on …
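The phenomenon this line of work studies can be seen in a few lines of code. Below is a minimal sketch (the architecture, target function, and hyperparameters are my own choices, not taken from the paper): a small tanh network is fit to a two-frequency target by full-batch gradient descent, and the Fourier amplitude of the residual is tracked at the low and the high frequency; the low-frequency component is typically fit first.

```python
# Illustrative sketch of the frequency principle / spectral bias (assumed toy setup,
# not the paper's experiments): a two-layer tanh network fit to a two-frequency target.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 256)[:, None]          # inputs on a 1D grid
y = np.sin(x) + 0.5 * np.sin(8 * x)                   # low + high frequency target

H = 128                                               # hidden width
W1 = rng.normal(0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1.0 / np.sqrt(H), (H, 1)); b2 = np.zeros(1)
lr = 1e-2

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def residual_amplitude(residual, k):
    # Magnitude of the k-th Fourier coefficient of the residual on the grid.
    return np.abs(np.fft.rfft(residual[:, 0]))[k]

for step in range(20001):
    h, pred = forward(x)
    err = pred - y                                     # gradient of 0.5 * MSE w.r.t. pred
    # Backprop through the two layers (full-batch gradient descent).
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    if step % 5000 == 0:
        # Remaining error at the low (k=1) and high (k=8) frequency.
        print(step, residual_amplitude(err, 1), residual_amplitude(err, 8))
```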
Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
Wide neural networks of any depth evolve as linear models under gradient descent
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …
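The central objects in this result are the first-order Taylor expansion of the network in its parameters around initialization and the empirical neural tangent kernel. In standard notation (a sketch of the setup, not quoted from the paper), for a network f with parameters θ trained by gradient flow on the squared loss over training data (X, Y):

```latex
% Linearized network and empirical NTK around initialization \theta_0.
f^{\mathrm{lin}}_t(x) \;=\; f_{\theta_0}(x) + \nabla_\theta f_{\theta_0}(x)^{\top}\,(\theta_t - \theta_0),
\qquad
\hat{\Theta}(x, x') \;=\; \nabla_\theta f_{\theta_0}(x)^{\top}\, \nabla_\theta f_{\theta_0}(x').

% Gradient flow on the squared loss over the training set (X, Y) with learning rate \eta:
\frac{\mathrm{d}}{\mathrm{d}t}\, f^{\mathrm{lin}}_t(X) \;=\; -\,\eta\, \hat{\Theta}(X, X)\,\bigl(f^{\mathrm{lin}}_t(X) - Y\bigr).
```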
Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training
In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable,
optimization program, in a quest to better understand deep neural networks that are trained …
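Schematically, the Layer-Peeled Model treats the last-layer features as free variables optimized jointly with the classifier weights, with the rest of the network replaced by norm budgets. The program has roughly the following form (constraint constants and averaging conventions reproduced from memory, so treat as a sketch):

```latex
% Layer-Peeled Model (sketch): classifier weights W = (w_1, ..., w_K) and last-layer
% features h_{k,i} are optimized directly; the earlier layers are "peeled off" into norm budgets.
\min_{W,\,H}\;\; \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \mathcal{L}\bigl(W h_{k,i},\, y_k\bigr)
\quad \text{subject to} \quad
\frac{1}{K}\sum_{k=1}^{K} \lVert w_k \rVert_2^2 \le E_W,
\qquad
\frac{1}{K}\sum_{k=1}^{K} \frac{1}{n_k}\sum_{i=1}^{n_k} \lVert h_{k,i} \rVert_2^2 \le E_H.
```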
Neural collapse under MSE loss: Proximity to and dynamics on the central path
The recently discovered Neural Collapse (NC) phenomenon occurs pervasively in today's
deep net training paradigm of driving cross-entropy (CE) loss towards zero. During NC, last …
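The NC phenomenon is usually quantified by a handful of diagnostics on the last-layer features; the sketch below computes two common ones (the metric conventions follow the usual NC1/NC2 definitions, but the function and variable names are mine):

```python
# Sketch of common neural-collapse diagnostics computed from last-layer features.
import numpy as np

def nc_metrics(feats, labels):
    """feats: (N, d) last-layer features; labels: (N,) integer class ids."""
    classes = np.unique(labels)
    global_mean = feats.mean(0)
    class_means = np.stack([feats[labels == c].mean(0) for c in classes])  # (K, d)

    # NC1: within-class variability relative to between-class variability.
    centered = class_means - global_mean
    Sigma_B = centered.T @ centered / len(classes)
    Sigma_W = np.zeros_like(Sigma_B)
    for c, mu in zip(classes, class_means):
        diff = feats[labels == c] - mu
        Sigma_W += diff.T @ diff / len(feats)
    nc1 = np.trace(Sigma_W @ np.linalg.pinv(Sigma_B)) / len(classes)

    # NC2: recentered class means approach a simplex ETF, i.e. equal norms and
    # pairwise cosines of -1/(K-1); measure the spread of the off-diagonal cosines.
    M = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    cosines = M @ M.T
    nc2 = np.std(cosines[~np.eye(len(classes), dtype=bool)])
    return nc1, nc2

# Usage: nc1, nc2 = nc_metrics(last_layer_features, train_labels)
```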
Modeling the influence of data structure on learning in neural networks: The hidden manifold model
Understanding the reasons for the success of deep neural networks trained using stochastic
gradient-based methods is a key open problem for the nascent theory of deep learning. The …
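The model's data-generation process, as I understand it, draws low-dimensional latent variables, maps them into input space through a fixed random projection and an elementwise nonlinearity, and assigns labels that depend only on the latents. A minimal sketch (dimensions, nonlinearities, and the teacher are illustrative choices):

```python
# Sketch of hidden-manifold-style data generation: inputs lie on a nonlinear
# low-dimensional manifold, labels depend only on the latent coordinates.
import numpy as np

rng = np.random.default_rng(0)
n_samples, input_dim, latent_dim = 10_000, 784, 16

F = rng.normal(size=(latent_dim, input_dim))        # fixed projection into input space
w_teacher = rng.normal(size=latent_dim)             # teacher acts on the latent, not the input

Z = rng.normal(size=(n_samples, latent_dim))        # latent coordinates on the "manifold"
X = np.tanh(Z @ F / np.sqrt(latent_dim))            # inputs: nonlinear embedding of the latents
y = np.sign(Z @ w_teacher)                          # labels depend only on the latent variables
```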
Dynamics of finite width kernel and prediction fluctuations in mean field neural networks
B. Bordelon, C. Pehlevan. Advances in Neural Information Processing Systems, 2024. proceedings.neurips.cc
We analyze the dynamics of finite width effects in wide but finite feature learning neural
networks. Starting from a dynamical mean field theory description of infinite width deep …
Finite depth and width corrections to the neural tangent kernel
We prove the precise scaling, at finite depth and width, for the mean and variance of the
neural tangent kernel (NTK) in a randomly initialized ReLU network. The standard deviation …
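A simple empirical companion to this kind of result (my own probe, not the paper's construction, and at fixed depth rather than the general depth treated there): estimate the mean and standard deviation of the NTK's diagonal entry over random initializations of a two-layer ReLU network in the NTK parametrization, at several widths; the mean stays roughly constant while the fluctuations shrink with width.

```python
# Empirical probe of NTK fluctuations at initialization (illustrative, fixed depth).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)                                    # a fixed input
d = x.size

def ntk_diag(width, rng):
    # NTK parametrization: f(x) = sqrt(2/width) * a . relu(W x / sqrt(d)).
    W = rng.normal(size=(width, d))
    a = rng.normal(size=width)
    pre = W @ x / np.sqrt(d)
    h = np.maximum(pre, 0.0)
    g_a = np.sqrt(2.0 / width) * h                                        # df/da
    g_W = np.sqrt(2.0 / width) * np.outer(a * (pre > 0), x) / np.sqrt(d)  # df/dW
    return g_a @ g_a + np.sum(g_W * g_W)                  # Theta(x, x) = ||grad_theta f||^2

for width in (16, 64, 256, 1024):
    samples = [ntk_diag(width, rng) for _ in range(500)]
    print(width, np.mean(samples), np.std(samples))
```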
The Gaussian equivalence of generative models for learning with shallow neural networks
Understanding the impact of data structure on the computational tractability of learning is a
key challenge for the theory of neural networks. Many theoretical works do not explicitly …
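The statement being made precise here can be probed numerically: train the same shallow learner on data from a generative map and on Gaussian data with matched first and second moments, then compare test errors. The sketch below uses a random one-layer generator and a random-features ridge learner (the whole setup is my own toy construction, not the paper's); the equivalence predicts the two errors should be close in the high-dimensional limit.

```python
# Toy probe of Gaussian equivalence: generated data vs. moment-matched Gaussian surrogate.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, latent_dim, input_dim, n_features = 4000, 4000, 32, 256, 512

A = rng.normal(size=(latent_dim, input_dim)) / np.sqrt(latent_dim)
teacher = rng.normal(size=input_dim) / np.sqrt(input_dim)
F = rng.normal(size=(input_dim, n_features))        # random-features projection (the learner)

def generate(z):
    return np.tanh(z @ A)                           # one-layer "generator"

def features(x):
    return np.maximum(x @ F / np.sqrt(input_dim), 0.0)

def make_data(n, surrogate, mean, chol):
    if surrogate:
        x = mean + rng.normal(size=(n, input_dim)) @ chol.T
    else:
        x = generate(rng.normal(size=(n, latent_dim)))
    y = x @ teacher + 0.1 * rng.normal(size=n)
    return x, y

# Match first and second moments of the generated inputs for the Gaussian surrogate.
X_ref = generate(rng.normal(size=(20000, latent_dim)))
mean, cov = X_ref.mean(0), np.cov(X_ref, rowvar=False)
chol = np.linalg.cholesky(cov + 1e-6 * np.eye(input_dim))

def test_error(surrogate):
    Xtr, ytr = make_data(n_train, surrogate, mean, chol)
    Xte, yte = make_data(n_test, surrogate, mean, chol)
    Phi_tr, Phi_te = features(Xtr), features(Xte)
    w = np.linalg.solve(Phi_tr.T @ Phi_tr + 1e-2 * np.eye(n_features), Phi_tr.T @ ytr)
    return np.mean((Phi_te @ w - yte) ** 2)

print("generator data:     ", test_error(False))
print("Gaussian surrogate: ", test_error(True))
```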