On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research for several reasons. First, its …

On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks

S Wang, H Wang, P Perdikaris - Computer Methods in Applied Mechanics …, 2021 - Elsevier
Physics-informed neural networks (PINNs) are demonstrating remarkable promise in
integrating physical models with gappy and noisy observational data, but they still struggle …
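
For context, a minimal sketch of a random Fourier feature embedding of the kind such networks build on (the frequency scale sigma, sizes, and variable names here are illustrative assumptions, not the multi-scale architecture proposed in the paper):

    import numpy as np

    def fourier_features(x, B):
        """Map inputs x (n, d) through random Fourier features with frequency matrix B (m, d)."""
        proj = 2.0 * np.pi * x @ B.T                                  # (n, m)
        return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)  # (n, 2m)

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 128).reshape(-1, 1)    # 1-D inputs
    sigma = 10.0                                      # frequency scale (assumed hyperparameter)
    B = sigma * rng.standard_normal((64, 1))          # sampled frequencies
    phi = fourier_features(x, B)                      # features fed to a linear layer or small MLP
    print(phi.shape)                                  # (128, 128)

Roughly speaking, the choice of frequency scale biases which eigendirections of the induced kernel are fit first, which is the kind of effect this paper analyzes.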

How neural networks extrapolate: From feedforward to graph neural networks

K Xu, M Zhang, J Li, SS Du, K Kawarabayashi… - arXiv preprint arXiv …, 2020 - arxiv.org
We study how neural networks trained by gradient descent extrapolate, i.e., what they learn
outside the support of the training distribution. Previous works report mixed empirical results …
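
As a toy illustration of the question studied here (an assumed setup, not the paper's experiments): a ReLU MLP fit on a bounded interval and then queried outside it, where such networks typically behave close to linearly off-support.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    x_train = rng.uniform(-1.0, 1.0, size=(512, 1))   # training support: [-1, 1]
    y_train = (x_train ** 2).ravel()                   # nonlinear target

    net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                       max_iter=5000, random_state=0).fit(x_train, y_train)

    x_test = np.array([[0.5], [2.0], [4.0]])           # in-range vs. out-of-range points
    print(net.predict(x_test))                         # predictions tend to grow ~linearly off-support
    print(x_test.ravel() ** 2)                         # true quadratic values for comparison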

Universal approximation with deep narrow networks

P Kidger, T Lyons - Conference on learning theory, 2020 - proceedings.mlr.press
The classical Universal Approximation Theorem holds for neural networks of
arbitrary width and bounded depth. Here we consider the natural 'dual' scenario for networks …
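
Hedging on the exact constants, the deep-narrow statement has roughly the following form, for compact $K \subset \mathbb{R}^n$, continuous targets $f : K \to \mathbb{R}^m$, and a broad class of activation functions:

\[
\forall\, \varepsilon > 0 \ \ \exists\ \text{a network } N \text{ of width at most } n + m + 2 \text{ (and finite depth)}: \quad \sup_{x \in K} \| f(x) - N(x) \| < \varepsilon .
\]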

Kernel and rich regimes in overparametrized models

B Woodworth, S Gunasekar, JD Lee… - … on Learning Theory, 2020 - proceedings.mlr.press
A recent line of work studies overparametrized neural networks in the “kernel regime,” i.e.,
when during training the network behaves as a kernelized linear predictor, and thus, training …
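
In the simplest model studied in this line of work, a two-layer “diagonal” parametrization $\beta = w_+^{\,2} - w_-^{\,2}$ initialized at scale $\alpha$, the initialization scale roughly interpolates between the two regimes (stated informally):

\[
\alpha \to \infty:\ \ \beta_{\mathrm{GD}} \approx \arg\min_{X\beta = y} \|\beta\|_2, \qquad \alpha \to 0:\ \ \beta_{\mathrm{GD}} \approx \arg\min_{X\beta = y} \|\beta\|_1 ,
\]

i.e., large initialization behaves like a kernel (minimum-$\ell_2$) method, while small initialization yields a sparsity-inducing “rich” bias.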

Gradient descent on two-layer nets: Margin maximization and simplicity bias

K Lyu, Z Li, R Wang, S Arora - Advances in Neural …, 2021 - proceedings.neurips.cc
The generalization mystery of overparametrized deep nets has motivated efforts to
understand how gradient descent (GD) converges to low-loss solutions that generalize well …
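
For $L$-homogeneous networks, the central quantity in this line of work is the normalized margin, and the results show (stated informally) that GD on exponential-type losses drives the parameters toward KKT points of its maximizers:

\[
\gamma(\theta) \;=\; \frac{\min_i\, y_i\, f(x_i; \theta)}{\|\theta\|_2^{\,L}} .
\]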

Neural fields as learnable kernels for 3D reconstruction

F Williams, Z Gojcic, S Khamis… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present Neural Kernel Fields: a novel method for reconstructing implicit 3D
shapes based on a learned kernel ridge regression. Our technique achieves state-of-the-art …
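
Independent of the learned kernel introduced in the paper, the kernel ridge regression backbone it builds on can be sketched generically as follows (the RBF kernel, lengthscale, regularization, and toy labels are assumptions for illustration, not the paper's learned neural kernel):

    import numpy as np

    def rbf_kernel(A, B, lengthscale=0.2):
        """Gram matrix of an RBF kernel between point sets A (n, d) and B (m, d)."""
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * lengthscale ** 2))

    def fit_krr(X, y, lam=1e-3):
        """Solve (K + lam * I) alpha = y; predictions are K(X_new, X) @ alpha."""
        K = rbf_kernel(X, X)
        return np.linalg.solve(K + lam * np.eye(len(X)), y)

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 3))       # toy 3D points
    y = np.sign(X[:, 0])                         # toy labels standing in for an implicit function
    alpha = fit_krr(X, y)
    X_new = rng.uniform(-1, 1, size=(5, 3))
    print(rbf_kernel(X_new, X) @ alpha)          # interpolated implicit-function values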

Implicit regularization towards rank minimization in ReLU networks

N Timor, G Vardi, O Shamir - International Conference on …, 2023 - proceedings.mlr.press
We study the conjectured relationship between the implicit regularization in neural networks,
trained with gradient-based methods, and rank minimization of their weight matrices …
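
One way to probe such a conjecture empirically is to inspect the numerical rank (singular value decay) of trained weight matrices. A minimal sketch, with a random low-rank-plus-noise matrix standing in for a trained layer's weights:

    import numpy as np

    def numerical_rank(W, tol=1e-2):
        """Count singular values above tol times the largest singular value."""
        s = np.linalg.svd(W, compute_uv=False)
        return int((s > tol * s[0]).sum()), s

    # Placeholder: a rank-2-plus-noise matrix standing in for a trained weight matrix.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((256, 2)) @ rng.standard_normal((2, 256)) \
        + 0.01 * rng.standard_normal((256, 256))

    rank, s = numerical_rank(W)
    print("numerical rank:", rank)                    # 2 for this placeholder
    print("top singular values:", np.round(s[:5], 2))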

Banach space representer theorems for neural networks and ridge splines

R Parhi, RD Nowak - Journal of Machine Learning Research, 2021 - jmlr.org
We develop a variational framework to understand the properties of the functions learned by
neural networks fit to data. We propose and study a family of continuous-domain linear …