Symmetries, flat minima, and the conserved quantities of gradient flow

B Zhao, I Ganev, R Walters, R Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
Empirical studies of the loss landscape of deep networks have revealed that many local
minima are connected through low-loss valleys. Yet, little is known about the theoretical …

Symmetry teleportation for accelerated optimization

B Zhao, N Dehmamy, R Walters… - Advances in neural …, 2022 - proceedings.neurips.cc
Existing gradient-based optimization methods update parameters locally, in a direction that
minimizes the loss function. We study a different approach, symmetry teleportation, that …

Improving Convergence and Generalization Using Parameter Symmetries

B Zhao, RM Gower, R Walters, R Yu - arXiv preprint arXiv:2305.13404, 2023 - arxiv.org
In many neural networks, different values of the parameters may result in the same loss
value. Parameter space symmetries are loss-invariant transformations that change the …

A Practical Approach for Employing Tensor Train Decomposition in Edge Devices

M Kokhazadeh, G Keramidas, V Kelefouras… - International Journal of …, 2024 - Springer
Deep Neural Networks (DNN) have made significant advances in various fields
including speech recognition and image processing. Typically, modern DNNs are both …

Charting Flat Minima Using the Conserved Quantities of Gradient Flow

B Zhao, I Ganev, R Walters, R Yu… - NeurIPS 2022 Workshop …, 2022 - openreview.net
Empirical studies have revealed that many minima in the loss landscape of deep learning
are connected and reside in a low-loss valley. We present a general framework for finding …

Finding Symmetry in Neural Network Parameter Spaces

B Zhao, N Dehmamy, R Walters, R Yu - openreview.net
Parameter space symmetries, or loss-invariant transformations, are important for
understanding neural networks' loss landscape, training dynamics, and generalization …

Conic Activation Functions

C Fu, LD Cohen - UniReps: 2nd Edition of the Workshop on Unifying … - openreview.net
Most activation functions operate component-wise, which restricts the equivariance of neural
networks to permutations. We introduce Conic Linear Units (CoLU) and generalize the …