Activation functions in artificial neural networks: A systematic overview

J Lederer - arXiv preprint arXiv:2101.09957, 2021 - arxiv.org
Activation functions shape the outputs of artificial neurons and, therefore, are integral parts
of neural networks in general and deep learning in particular. Some activation functions …
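
The snippet does not name specific functions; as a minimal, hypothetical sketch of the kind of functions such a survey covers, the NumPy code below applies ReLU, sigmoid, and tanh element-wise to a neuron's pre-activation (the names and values are illustrative and not taken from the paper).

    import numpy as np

    # Illustrative only (not from the survey): three common activation
    # functions applied element-wise to a pre-activation vector z.

    def relu(z):
        return np.maximum(0.0, z)          # max(0, z)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))    # squashes outputs into (0, 1)

    def tanh(z):
        return np.tanh(z)                  # squashes outputs into (-1, 1)

    z = np.array([-2.0, 0.0, 2.0])
    print(relu(z), sigmoid(z), tanh(z))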

Statistical guarantees for regularized neural networks

M Taheri, F Xie, J Lederer - Neural Networks, 2021 - Elsevier
Neural networks have become standard tools in the analysis of data, but they lack
comprehensive mathematical theories. For example, there are very few statistical …

Statistical guarantees for sparse deep learning

J Lederer - AStA Advances in Statistical Analysis, 2024 - Springer
Neural networks are becoming increasingly popular in applications, but our mathematical
understanding of their potential and limitations is still limited. In this paper, we further this …

Amortized neural networks for low-latency speech recognition

J Macoskey, GP Strimel, J Su, A Rastrow - arXiv preprint arXiv:2108.01553, 2021 - arxiv.org
We introduce Amortized Neural Networks (AmNets), a compute-cost- and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …

Deep learning-based analysis of true triaxial DEM simulations: Role of fabric and particle aspect ratio

N Irani, M Salimi, P Golestaneh, M Tafili… - Computers and …, 2024 - Elsevier
This study investigates the influence of micro-scale entities such as inherent and induced
fabric anisotropy on the stress–strain behaviour of granular assemblies. In tandem with this …

No spurious local minima: on the optimization landscapes of wide and deep neural networks

J Lederer - 2020 - openreview.net
Empirical studies suggest that wide neural networks are comparably easy to optimize, but
mathematical support for this observation is scarce. In this paper, we analyze the …

Regularization and reparameterization avoid vanishing gradients in sigmoid-type networks

L Ven, J Lederer - arXiv preprint arXiv:2106.02260, 2021 - arxiv.org
Deep learning requires several design choices, such as the nodes' activation functions and
the widths, types, and arrangements of the layers. One consideration when making these …
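
The snippet does not describe the paper's construction; as a hedged illustration of the vanishing-gradient phenomenon the title refers to, the sketch below backpropagates a unit gradient through a deep chain of sigmoid layers and prints how its norm shrinks toward the input (the architecture, widths, and initialization are assumptions for the demo, not the authors' setup).

    import numpy as np

    # Hypothetical demo (not the paper's method): gradient norms decay
    # through stacked sigmoid layers because the sigmoid derivative is
    # at most 1/4 and the weights are moderately scaled.

    rng = np.random.default_rng(0)
    depth, width = 30, 16
    weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
               for _ in range(depth)]

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass, keeping pre-activations for the backward pass.
    a, pre_acts = rng.normal(size=width), []
    for W in weights:
        z = W @ a
        pre_acts.append(z)
        a = sigmoid(z)

    # Backward pass: propagate a unit gradient and record its norm per layer.
    grad, norms = np.ones(width), []
    for W, z in zip(reversed(weights), reversed(pre_acts)):
        grad = W.T @ (grad * sigmoid(z) * (1.0 - sigmoid(z)))
        norms.append(np.linalg.norm(grad))

    print(norms[0], norms[-1])  # the norm near the input is far smaller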

Reducing Computational and Statistical Complexity in Machine Learning Through Cardinality Sparsity

A Mohades, J Lederer - arXiv preprint arXiv:2302.08235, 2023 - arxiv.org
High-dimensional data has become ubiquitous across the sciences but causes
computational and statistical challenges. A common approach for dealing with these …

Representations learnt by SGD and Adaptive learning rules: Conditions that vary sparsity and selectivity in neural network

JH Park - arXiv preprint arXiv:2201.11653, 2022 - arxiv.org
From the point of view of the human brain, continual learning allows various tasks to be performed
without mutual interference. An effective way to reduce mutual interference can be found in …

Optimization landscapes of wide deep neural networks are benign

J Lederer - arXiv preprint arXiv:2010.00885, 2020 - arxiv.org
We analyze the optimization landscapes of deep learning with wide networks. We highlight
the importance of constraints for such networks and show that constraint--as well as …