Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation

M Belkin - Acta Numerica, 2021 - cambridge.org
In the past decade the mathematical theory of machine learning has lagged far behind the
triumphs of deep neural networks on practical challenges. However, the gap between theory …
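
The interpolation regime named in the title refers to predictors that fit the training data exactly; in symbols (standard usage, not a quotation from the paper):

```latex
\hat{f}(x_i) = y_i \quad \text{for all } i = 1,\dots,n,
```

i.e. zero training error, precisely the setting in which classical bias-variance reasoning would predict poor generalization.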

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks

C Liu, L Zhu, M Belkin - Applied and Computational Harmonic Analysis, 2022 - Elsevier
The success of deep learning is due, to a large extent, to the remarkable effectiveness of
gradient-based optimization methods applied to large neural networks. The purpose of this …
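
Convergence analyses in this line of work typically rest on a Polyak-Lojasiewicz (PL) type condition rather than convexity; a standard statement, sketched here under the usual smoothness assumption and not quoted from the paper:

```latex
\tfrac{1}{2}\,\|\nabla L(\theta)\|^2 \;\ge\; \mu\,\bigl(L(\theta) - L^*\bigr)
\quad\Longrightarrow\quad
L(\theta_t) - L^* \;\le\; (1 - \eta\mu)^t\,\bigl(L(\theta_0) - L^*\bigr)
```

for gradient descent with step size $\eta \le 1/\beta$ on a $\beta$-smooth loss $L$ with infimum $L^*$: linear convergence without any convexity assumption.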

Graph neural networks are inherently good generalizers: Insights by bridging GNNs and MLPs

C Yang, Q Wu, J Wang, J Yan - arXiv preprint arXiv:2212.09034, 2022 - arxiv.org
Graph neural networks (GNNs), as the de facto model class for representation learning on
graphs, are built upon the multi-layer perceptron (MLP) architecture with additional …
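
The GNN-as-MLP-plus-aggregation relationship in this snippet can be made concrete; a minimal NumPy sketch contrasting a plain MLP layer with a GCN-style layer (a generic illustration, not the specific architecture studied in the paper):

```python
import numpy as np

def mlp_layer(H, W):
    """Plain MLP layer: transforms each node's features independently."""
    return np.maximum(H @ W, 0.0)  # ReLU(H W)

def gcn_layer(H, W, A):
    """GCN-style layer: the same transform, preceded by normalized
    neighborhood aggregation A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU(A_hat H W)

# Toy example: a 4-node path graph, 3-d features, 2-d output.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2))
print(mlp_layer(H, W))     # ignores the graph entirely
print(gcn_layer(H, W, A))  # mixes each node with its neighbors
```

The only difference between the two layers is the normalized aggregation applied before the shared MLP transform, which is the structural bridge the title refers to.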

Optimization of graph neural networks: Implicit acceleration by skip connections and more depth

K Xu, M Zhang, S Jegelka… - … on Machine Learning, 2021 - proceedings.mlr.press
Graph Neural Networks (GNNs) have been studied through the lens of expressive
power and generalization. However, their optimization properties are less well understood …
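
For reference, a skip (residual) connection in a GNN layer has the generic form below (my notation; the paper's precise setting may differ):

```latex
H^{(k+1)} \;=\; H^{(k)} + \sigma\bigl(\hat{A}\,H^{(k)}\,W^{(k)}\bigr)
\qquad\text{vs.}\qquad
H^{(k+1)} \;=\; \sigma\bigl(\hat{A}\,H^{(k)}\,W^{(k)}\bigr),
```

where $\hat{A}$ is the normalized adjacency matrix; the identity branch on the left is what the title credits with implicit acceleration.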

Score-based generative neural networks for large-scale optimal transport

M Daniels, T Maunu, P Hand - Advances in neural …, 2021 - proceedings.neurips.cc
We consider the fundamental problem of sampling the optimal transport coupling between
given source and target distributions. In certain cases, the optimal transport plan takes the …
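
The sampling problem here is posed over Kantorovich couplings; the standard formulation (textbook material, not specific to this paper):

```latex
\mathrm{OT}(\mu,\nu) \;=\; \inf_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\,\mathrm{d}\pi(x,y),
```

where $\Pi(\mu,\nu)$ is the set of joint distributions with marginals $\mu$ and $\nu$. For quadratic cost $c(x,y) = \|x-y\|^2$ and absolutely continuous $\mu$, Brenier's theorem makes the optimal plan a deterministic map.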

The interpolation phase transition in neural networks: Memorization and generalization under lazy training

A Montanari, Y Zhong - The Annals of Statistics, 2022 - projecteuclid.org
The Annals of Statistics, Vol. 50, No. 5, pp. 2816–2847. https://doi.org/10.1214/22-AOS2211

Neural networks as kernel learners: The silent alignment effect

A Atanasov, B Bordelon, C Pehlevan - arXiv preprint arXiv:2111.00034, 2021 - arxiv.org
Neural networks in the lazy training regime converge to kernel machines. Can neural
networks in the rich feature learning regime learn a kernel machine with a data-dependent …
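
The lazy-regime claim in the snippet comes from linearizing the network in its parameters; the standard tangent-kernel statement (background, not the paper's new result):

```latex
f(x;\theta) \;\approx\; f(x;\theta_0) + \bigl\langle \nabla_\theta f(x;\theta_0),\, \theta - \theta_0 \bigr\rangle,
\qquad
K(x,x') \;=\; \bigl\langle \nabla_\theta f(x;\theta_0),\, \nabla_\theta f(x';\theta_0) \bigr\rangle .
```

Gradient descent on the linearized model is kernel regression with the tangent kernel $K$, which is what "converge to kernel machines" means here.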

Training multi-layer over-parametrized neural network in subquadratic time

Z Song, L Zhang, R Zhang - arXiv preprint arXiv:2112.07628, 2021 - arxiv.org
We consider the problem of training a multi-layer over-parametrized neural network to
minimize the empirical risk induced by a loss function. In the typical setting of over …
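
For reference, the empirical risk being minimized is the usual sample average (standard definition, not specific to this paper):

```latex
\widehat{R}(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i;\theta),\, y_i\bigr),
```

and "over-parametrized" means the parameter count is large enough relative to $n$ that zero empirical risk is attainable.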

Numerical analysis of physics-informed neural networks and related models in physics-informed machine learning

T De Ryck, S Mishra - arXiv preprint arXiv:2402.10926, 2024 - arxiv.org
Physics-informed neural networks (PINNs) and their variants have been very popular in
recent years as algorithms for the numerical simulation of both forward and inverse …
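
The PINN objective these analyses target combines a PDE-residual term with boundary or data terms; in its standard form (generic formulation with my notation for the sample sizes $N_r$, $N_b$ and weight $\lambda$; the paper's variants may add terms):

```latex
L(\theta) \;=\; \frac{1}{N_r}\sum_{i=1}^{N_r} \bigl|\mathcal{N}[u_\theta](x_i) - f(x_i)\bigr|^2
\;+\; \frac{\lambda}{N_b}\sum_{j=1}^{N_b} \bigl|u_\theta(y_j) - g(y_j)\bigr|^2,
```

for a PDE $\mathcal{N}[u] = f$ with boundary data $g$, minimized over the network parameters $\theta$ at interior collocation points $x_i$ and boundary points $y_j$.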