Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M Belkin - Acta Numerica, 2021 - cambridge.org
In the past decade the mathematical theory of machine learning has lagged far behind the
triumphs of deep neural networks on practical challenges. However, the gap between theory …
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
The success of deep learning is due, to a large extent, to the remarkable effectiveness of
gradient-based optimization methods applied to large neural networks. The purpose of this …
Graph neural networks are inherently good generalizers: Insights by bridging gnns and mlps
Graph neural networks (GNNs), as the de-facto model class for representation learning on
graphs, are built upon the multi-layer perceptron (MLP) architecture with additional …
Optimization of graph neural networks: Implicit acceleration by skip connections and more depth
Abstract Graph Neural Networks (GNNs) have been studied through the lens of expressive
power and generalization. However, their optimization properties are less well understood …
Score-based generative neural networks for large-scale optimal transport
We consider the fundamental problem of sampling the optimal transport coupling between
given source and target distributions. In certain cases, the optimal transport plan takes the …
The interpolation phase transition in neural networks: Memorization and generalization under lazy training
A Montanari, Y Zhong - The Annals of Statistics, 2022 - projecteuclid.org
The Annals of Statistics, 2022, Vol. 50, No. 5, 2816–2847. https://doi.org/10.1214/22-AOS2211
Neural networks as kernel learners: The silent alignment effect
Neural networks in the lazy training regime converge to kernel machines. Can neural
networks in the rich feature learning regime learn a kernel machine with a data-dependent …
Training multi-layer over-parametrized neural network in subquadratic time
We consider the problem of training a multi-layer over-parametrized neural network to
minimize the empirical risk induced by a loss function. In the typical setting of over …
Numerical analysis of physics-informed neural networks and related models in physics-informed machine learning
Physics-informed neural networks (PINNs) and their variants have been very popular in
recent years as algorithms for the numerical simulation of both forward and inverse …