Solving the Kolmogorov PDE by means of deep learning

C Beck, S Becker, P Grohs, N Jaafari… - Journal of Scientific …, 2021 - Springer
Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations
(PDEs) associated to them have been widely used in models from engineering, finance, and …

On the bias-variance-cost tradeoff of stochastic optimization

Y Hu, X Chen, N He - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We consider stochastic optimization when one only has access to biased stochastic oracles
of the objective, and obtaining stochastic gradients with low biases comes at high costs. This …

Full error analysis for the training of deep neural networks

C Beck, A Jentzen, B Kuckuck - Infinite Dimensional Analysis …, 2022 - World Scientific
Deep learning algorithms have been applied very successfully in recent years to a range of
problems out of reach for classical solution paradigms. Nevertheless, there is no completely …

Convergence of stochastic gradient descent schemes for Łojasiewicz-landscapes

S Dereich, S Kassing - arXiv preprint arXiv:2102.09385, 2021 - arxiv.org
In this article, we consider convergence of stochastic gradient descent schemes (SGD)
under weak assumptions on the underlying landscape. More explicitly, we show that on the …

A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise …

A Jentzen, A Riekert - Journal of Machine Learning Research, 2022 - jmlr.org
Gradient descent (GD) type optimization methods are the standard instrument to train
artificial neural networks (ANNs) with rectified linear unit (ReLU) activation. Despite the …

Strong error analysis for stochastic gradient descent optimization algorithms

A Jentzen, B Kuckuck, A Neufeld… - IMA Journal of …, 2021 - academic.oup.com
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of
machine learning applications. In this article we perform a rigorous strong error analysis for …

Blow up phenomena for gradient descent optimization methods in the training of artificial neural networks

D Gallon, A Jentzen, F Lindner - arXiv preprint arXiv:2211.15641, 2022 - arxiv.org
In this article we investigate blow up phenomena for gradient descent optimization methods
in the training of artificial neural networks (ANNs). Our theoretical analysis is focused on …

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions

A Jentzen, A Riekert - Zeitschrift für angewandte Mathematik und Physik, 2022 - Springer
In this article we study the stochastic gradient descent (SGD) optimization method in the
training of fully connected feedforward artificial neural networks with ReLU activation. The …

Multi-level Monte Carlo gradient methods for stochastic optimization with biased oracles

Y Hu, J Wang, X Chen, N He - arXiv preprint arXiv:2408.11084, 2024 - arxiv.org
We consider stochastic optimization when one only has access to biased stochastic oracles
of the objective and the gradient, and obtaining stochastic gradients with low biases comes …

Convergence proof for stochastic gradient descent in the training of deep neural networks with ReLU activation for constant target functions

M Hutzenthaler, A Jentzen, K Pohl, A Riekert… - arXiv preprint arXiv …, 2021 - arxiv.org
In many numerical simulations stochastic gradient descent (SGD) type optimization methods
perform very effectively in the training of deep neural networks (DNNs) but till this day it …