Sampling weights of deep neural networks

EL Bolager, I Burak, C Datar, Q Sun… - Advances in Neural …, 2023 - proceedings.neurips.cc
We introduce a probability distribution, combined with an efficient sampling algorithm, for
weights and biases of fully-connected neural networks. In a supervised learning context, no …

Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension

M Haas, D Holzmüller, U Luxburg… - Advances in Neural …, 2024 - proceedings.neurips.cc
The success of over-parameterized neural networks trained to near-zero training error has
caused great interest in the phenomenon of benign overfitting, where estimators are …

Why shallow networks struggle with approximating and learning high frequency: A numerical study

S Zhang, H Zhao, Y Zhong, H Zhou - arXiv preprint arXiv:2306.17301, 2023 - arxiv.org
In this work, a comprehensive numerical study involving analysis and experiments shows
why a two-layer neural network has difficulties handling high frequencies in approximation …

On the omnipresence of spurious local minima in certain neural network training problems

C Christof, J Kowalczyk - Constructive Approximation, 2023 - Springer
We study the loss landscape of training problems for deep artificial neural networks with a
one-dimensional real output whose activation functions contain an affine segment and …

How to Train an Artificial Neural Network to Predict Higher Heating Values of Biofuel

A Matveeva, A Bychkov - Energies, 2022 - mdpi.com
Plant biomass is one of the most promising and easy-to-use sources of renewable energy.
Direct determination of higher heating values of fuel in an adiabatic calorimeter is too …

When Are Bias-Free ReLU Networks Like Linear Networks?

Y Zhang, A Saxe, PE Latham - arXiv preprint arXiv:2406.12615, 2024 - arxiv.org
We investigate the expressivity and learning dynamics of bias-free ReLU networks. We first
show that two-layer bias-free ReLU networks have limited expressivity: the only odd function …

Generative Feature Training of Thin 2-Layer Networks

J Hertrich, S Neumayer - arXiv preprint arXiv:2411.06848, 2024 - arxiv.org
We consider the approximation of functions by 2-layer neural networks with a small number
of hidden weights based on the squared loss and small datasets. Due to the highly non …

Critical point-finding methods reveal gradient-flat regions of deep network losses

CG Frye, J Simon, NS Wadia, A Ligeralde… - Neural …, 2021 - direct.mit.edu
Despite the fact that the loss functions of deep neural networks are highly nonconvex,
gradient-based optimization algorithms converge to approximately the same performance …

Regression from linear models to neural networks: double descent, active learning, and sampling

D Holzmüller - 2023 - elib.uni-stuttgart.de
Regression, that is, the approximation of functions from (noisy) data, is a ubiquitous task in
machine learning and beyond. In this thesis, we study regression in three different settings …

Persistent Neurons

Y Min - arXiv preprint arXiv:2007.01419, 2020 - arxiv.org
Neural network (NN)-based learning algorithms are strongly affected by the choices of
initialization and data distribution. Different optimization strategies have been proposed for …