Activation functions in artificial neural networks: A systematic overview
J Lederer - arXiv preprint arXiv:2101.09957, 2021 - arxiv.org
Activation functions shape the outputs of artificial neurons and, therefore, are integral parts
of neural networks in general and deep learning in particular. Some activation functions …
Statistical guarantees for regularized neural networks
Neural networks have become standard tools in the analysis of data, but they lack
comprehensive mathematical theories. For example, there are very few statistical …
Statistical guarantees for sparse deep learning
J Lederer - AStA Advances in Statistical Analysis, 2024 - Springer
Neural networks are becoming increasingly popular in applications, but our mathematical
understanding of their potential and limitations is still limited. In this paper, we further this …
Amortized neural networks for low-latency speech recognition
We introduce Amortized Neural Networks (AmNets), a compute cost-and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …
Deep learning-based analysis of true triaxial DEM simulations: Role of fabric and particle aspect ratio
This study investigates the influence of micro-scale entities such as inherent and induced
fabric anisotropy on the stress–strain behaviour of granular assemblies. In tandem with this …
No spurious local minima: on the optimization landscapes of wide and deep neural networks
J Lederer - 2020 - openreview.net
Empirical studies suggest that wide neural networks are comparably easy to optimize, but
mathematical support for this observation is scarce. In this paper, we analyze the …
Regularization and reparameterization avoid vanishing gradients in sigmoid-type networks
L Ven, J Lederer - arXiv preprint arXiv:2106.02260, 2021 - arxiv.org
Deep learning requires several design choices, such as the nodes' activation functions and
the widths, types, and arrangements of the layers. One consideration when making these …
Reducing Computational and Statistical Complexity in Machine Learning Through Cardinality Sparsity
High-dimensional data has become ubiquitous across the sciences but poses
computational and statistical challenges. A common approach for dealing with these …
Representations learnt by SGD and Adaptive learning rules: Conditions that vary sparsity and selectivity in neural network
JH Park - arXiv preprint arXiv:2201.11653, 2022 - arxiv.org
From the point of view of the human brain, continual learning can perform various tasks
without mutual interference. An effective way to reduce mutual interference can be found in …
Optimization landscapes of wide deep neural networks are benign
J Lederer - arXiv preprint arXiv:2010.00885, 2020 - arxiv.org
We analyze the optimization landscapes of deep learning with wide networks. We highlight
the importance of constraints for such networks and show that constrained as well as …