Activation functions in artificial neural networks: A systematic overview
J Lederer - arXiv preprint arXiv:2101.09957, 2021 - arxiv.org
Activation functions shape the outputs of artificial neurons and, therefore, are integral parts
of neural networks in general and deep learning in particular. Some activation functions …
Statistical guarantees for regularized neural networks
Neural networks have become standard tools in the analysis of data, but they lack
comprehensive mathematical theories. For example, there are very few statistical …
Statistical guarantees for sparse deep learning
J Lederer - AStA Advances in Statistical Analysis, 2024 - Springer
Neural networks are becoming increasingly popular in applications, but our mathematical
understanding of their potential and limitations is still limited. In this paper, we further this …
Amortized neural networks for low-latency speech recognition
We introduce Amortized Neural Networks (AmNets), a compute cost-and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …
Deep learning-based analysis of true triaxial DEM simulations: Role of fabric and particle aspect ratio
This study investigates the influence of micro-scale entities such as inherent and induced
fabric anisotropy on the stress–strain behaviour of granular assemblies. In tandem with this …
No spurious local minima: on the optimization landscapes of wide and deep neural networks
J Lederer - 2020 - openreview.net
Empirical studies suggest that wide neural networks are comparably easy to optimize, but
mathematical support for this observation is scarce. In this paper, we analyze the …
Regularization and reparameterization avoid vanishing gradients in sigmoid-type networks
L Ven, J Lederer - arXiv preprint arXiv:2106.02260, 2021 - arxiv.org
Deep learning requires several design choices, such as the nodes' activation functions and
the widths, types, and arrangements of the layers. One consideration when making these …
Reducing Computational and Statistical Complexity in Machine Learning Through Cardinality Sparsity
High-dimensional data has become ubiquitous across the sciences but poses
computational and statistical challenges. A common approach for dealing with these …
Representations learnt by SGD and Adaptive learning rules: Conditions that vary sparsity and selectivity in neural network
JH Park - arXiv preprint arXiv:2201.11653, 2022 - arxiv.org
From the point of view of the human brain, continual learning can perform various tasks
without mutual interference. An effective way to reduce mutual interference can be found in …
Optimization landscapes of wide deep neural networks are benign
J Lederer - arXiv preprint arXiv:2010.00885, 2020 - arxiv.org
We analyze the optimization landscapes of deep learning with wide networks. We highlight
the importance of constraints for such networks and show that constrained as well as …