Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation

M Belkin - Acta Numerica, 2021 - cambridge.org
In the past decade the mathematical theory of machine learning has lagged far behind the
triumphs of deep neural networks on practical challenges. However, the gap between theory …
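
The interpolation regime named in the title refers to predictors that fit the training data exactly; in symbols (standard usage, not a quotation from the paper):

```latex
\hat{f}(x_i) = y_i \quad \text{for all } i = 1,\dots,n,
```

i.e. zero training error, precisely the setting in which classical bias-variance reasoning would predict poor generalization.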

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks

C Liu, L Zhu, M Belkin - Applied and Computational Harmonic Analysis, 2022 - Elsevier
The success of deep learning is due, to a large extent, to the remarkable effectiveness of
gradient-based optimization methods applied to large neural networks. The purpose of this …
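
Convergence analyses in this line of work typically rest on a Polyak-Lojasiewicz (PL) type condition rather than convexity; a standard statement, sketched here under the usual smoothness assumption and not quoted from the paper:

```latex
\tfrac{1}{2}\,\|\nabla L(\theta)\|^2 \;\ge\; \mu\,\bigl(L(\theta) - L^*\bigr)
\quad\Longrightarrow\quad
L(\theta_t) - L^* \;\le\; (1 - \eta\mu)^t\,\bigl(L(\theta_0) - L^*\bigr)
```

for gradient descent with step size $\eta \le 1/\beta$ on a $\beta$-smooth loss $L$ with infimum $L^*$: linear convergence without any convexity assumption.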

Graph neural networks are inherently good generalizers: Insights by bridging GNNs and MLPs

C Yang, Q Wu, J Wang, J Yan - arXiv preprint arXiv:2212.09034, 2022 - arxiv.org
Graph neural networks (GNNs), as the de facto model class for representation learning on
graphs, are built upon the multi-layer perceptron (MLP) architecture with additional …
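
The GNN-as-MLP-plus-aggregation relationship in this snippet can be made concrete; a minimal NumPy sketch contrasting a plain MLP layer with a GCN-style layer (a generic illustration, not the specific architecture studied in the paper):

```python
import numpy as np

def mlp_layer(H, W):
    """Plain MLP layer: transforms each node's features independently."""
    return np.maximum(H @ W, 0.0)  # ReLU(H W)

def gcn_layer(H, W, A):
    """GCN-style layer: the same transform, preceded by normalized
    neighborhood aggregation A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU(A_hat H W)

# Toy example: a 4-node path graph, 3-d features, 2-d output.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2))
print(mlp_layer(H, W))     # ignores the graph entirely
print(gcn_layer(H, W, A))  # mixes each node with its neighbors
```

The only difference between the two layers is the normalized aggregation applied before the shared MLP transform, which is the structural bridge the title refers to.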

Optimization of graph neural networks: Implicit acceleration by skip connections and more depth

K Xu, M Zhang, S Jegelka… - … on Machine Learning, 2021 - proceedings.mlr.press
Graph Neural Networks (GNNs) have been studied through the lens of expressive
power and generalization. However, their optimization properties are less well understood …
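
For reference, a skip (residual) connection in a GNN layer has the generic form below (my notation; the paper's precise setting may differ):

```latex
H^{(k+1)} \;=\; H^{(k)} + \sigma\bigl(\hat{A}\,H^{(k)}\,W^{(k)}\bigr)
\qquad\text{vs.}\qquad
H^{(k+1)} \;=\; \sigma\bigl(\hat{A}\,H^{(k)}\,W^{(k)}\bigr),
```

where $\hat{A}$ is the normalized adjacency matrix; the identity branch on the left is what the title credits with implicit acceleration.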

Score-based generative neural networks for large-scale optimal transport

M Daniels, T Maunu, P Hand - Advances in neural …, 2021 - proceedings.neurips.cc
We consider the fundamental problem of sampling the optimal transport coupling between
given source and target distributions. In certain cases, the optimal transport plan takes the …
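
The sampling problem here is posed over Kantorovich couplings; the standard formulation (textbook material, not specific to this paper):

```latex
\mathrm{OT}(\mu,\nu) \;=\; \inf_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\,\mathrm{d}\pi(x,y),
```

where $\Pi(\mu,\nu)$ is the set of joint distributions with marginals $\mu$ and $\nu$. For quadratic cost $c(x,y) = \|x-y\|^2$ and absolutely continuous $\mu$, Brenier's theorem makes the optimal plan a deterministic map.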

The interpolation phase transition in neural networks: Memorization and generalization under lazy training

A Montanari, Y Zhong - The Annals of Statistics, 2022 - projecteuclid.org
The Annals of Statistics, Vol. 50, No. 5, pp. 2816–2847. https://doi.org/10.1214/22-AOS2211

Neural networks as kernel learners: The silent alignment effect

A Atanasov, B Bordelon, C Pehlevan - arXiv preprint arXiv:2111.00034, 2021 - arxiv.org
Neural networks in the lazy training regime converge to kernel machines. Can neural
networks in the rich feature learning regime learn a kernel machine with a data-dependent …
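
The lazy-regime claim in the snippet comes from linearizing the network in its parameters; the standard tangent-kernel statement (background, not the paper's new result):

```latex
f(x;\theta) \;\approx\; f(x;\theta_0) + \bigl\langle \nabla_\theta f(x;\theta_0),\, \theta - \theta_0 \bigr\rangle,
\qquad
K(x,x') \;=\; \bigl\langle \nabla_\theta f(x;\theta_0),\, \nabla_\theta f(x';\theta_0) \bigr\rangle .
```

Gradient descent on the linearized model is kernel regression with the tangent kernel $K$, which is what "converge to kernel machines" means here.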

Training multi-layer over-parametrized neural network in subquadratic time

Z Song, L Zhang, R Zhang - arXiv preprint arXiv:2112.07628, 2021 - arxiv.org
We consider the problem of training a multi-layer over-parametrized neural network to
minimize the empirical risk induced by a loss function. In the typical setting of over …
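
For reference, the empirical risk being minimized is the usual sample average (standard definition, not specific to this paper):

```latex
\widehat{R}(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i;\theta),\, y_i\bigr),
```

and "over-parametrized" means the parameter count is large enough relative to $n$ that zero empirical risk is attainable.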

Numerical analysis of physics-informed neural networks and related models in physics-informed machine learning

T De Ryck, S Mishra - arXiv preprint arXiv:2402.10926, 2024 - arxiv.org
Physics-informed neural networks (PINNs) and their variants have been very popular in
recent years as algorithms for the numerical simulation of both forward and inverse …
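
The PINN objective these analyses target combines a PDE-residual term with boundary or data terms; in its standard form (generic formulation with my notation for the sample sizes $N_r$, $N_b$ and weight $\lambda$; the paper's variants may add terms):

```latex
L(\theta) \;=\; \frac{1}{N_r}\sum_{i=1}^{N_r} \bigl|\mathcal{N}[u_\theta](x_i) - f(x_i)\bigr|^2
\;+\; \frac{\lambda}{N_b}\sum_{j=1}^{N_b} \bigl|u_\theta(y_j) - g(y_j)\bigr|^2,
```

for a PDE $\mathcal{N}[u] = f$ with boundary data $g$, minimized over the network parameters $\theta$ at interior collocation points $x_i$ and boundary points $y_j$.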