On the expressive power of deep polynomial neural networks

J Kileel, M Trager, J Bruna - Advances in neural information …, 2019 - proceedings.neurips.cc
We study deep neural networks with polynomial activations, particularly their expressive
power. For a fixed architecture and activation degree, a polynomial neural network defines …
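
As a minimal illustration of the object in the title (not taken from the paper; the architecture, widths, and degree below are illustrative assumptions), a polynomial neural network alternates linear layers with a coordinate-wise monomial activation, so each output coordinate is a polynomial in the inputs and in the weights:

```python
import numpy as np

def poly_net(x, weights, degree=2):
    """Evaluate a polynomial network: alternate linear layers with
    the coordinate-wise power activation t -> t**degree."""
    h = x
    for W in weights[:-1]:
        h = (W @ h) ** degree   # linear map followed by monomial activation
    return weights[-1] @ h      # final layer is linear (no activation)

rng = np.random.default_rng(0)
widths = [3, 4, 4, 2]           # illustrative architecture d0 -> d1 -> d2 -> d3
weights = [rng.standard_normal((widths[i + 1], widths[i])) for i in range(3)]

x = rng.standard_normal(3)
print(poly_net(x, weights))     # each output coordinate is a polynomial in x (and in the weights)
```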

Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers

B Bah, H Rauhut, U Terstiege… - … and Inference: A …, 2022 - academic.oup.com
We study the convergence of gradient flows related to learning deep linear neural networks
(where the activation function is the identity map) from data. In this case, the composition of …
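
A minimal sketch of the setting described in the snippet (layer sizes and data are illustrative assumptions): with the identity activation, the network collapses to the product of its layer matrices, so the loss depends on the parameters only through that end-to-end matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.standard_normal((5, 5)) for _ in range(4)]   # W_1, ..., W_N (illustrative sizes)

def end_to_end(layers):
    """Identity activation: the network computes x -> W_N ... W_1 x."""
    W = np.eye(layers[0].shape[1])
    for Wj in layers:
        W = Wj @ W
    return W

X = rng.standard_normal((5, 100))                 # inputs
Y = rng.standard_normal((5, 100))                 # targets
W = end_to_end(layers)
loss = 0.5 * np.linalg.norm(W @ X - Y) ** 2       # squared loss depends on the product only
print(loss)
```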

The effect of smooth parametrizations on nonconvex optimization landscapes

E Levin, J Kileel, N Boumal - Mathematical Programming, 2024 - Springer
We develop new tools to study landscapes in nonconvex optimization. Given one
optimization problem, we pair it with another by smoothly parametrizing the domain. This is …
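
A hedged sketch of the general setup: given a cost f on one domain, a smooth parametrization phi of that domain yields a paired "lifted" problem with cost g = f ∘ phi. The example below (a rank-constrained PSD parametrization X = L Lᵀ) is a standard instance of such a pairing, chosen for illustration and not necessarily one of the paper's examples.

```python
import numpy as np

def f(X):
    # an illustrative smooth cost on symmetric matrices ("downstairs" problem)
    A = np.diag([3.0, 1.0, 0.0])
    return 0.5 * np.linalg.norm(X - A) ** 2

def phi(L):
    # smooth parametrization of PSD matrices of rank <= 2 by factor matrices L
    return L @ L.T

def g(L):
    # the pulled-back ("upstairs") cost whose landscape is paired with f's
    return f(phi(L))

L = np.random.default_rng(2).standard_normal((3, 2))
print(g(L))
```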

[Book] Metric algebraic geometry

P Breiding, K Kohn, B Sturmfels - 2024 - library.oapen.org
Metric algebraic geometry combines concepts from algebraic geometry and differential
geometry. Building on classical foundations, it offers practical tools for the 21st century …

Critical points and convergence analysis of generative deep linear networks trained with Bures-Wasserstein loss

P Bréchet, K Papagiannouli, J An… - … on Machine Learning, 2023 - proceedings.mlr.press
We consider a deep matrix factorization model of covariance matrices trained with the Bures-
Wasserstein distance. While recent works have made advances in the study of the …
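
For reference, the loss named in the title is the Bures-Wasserstein distance between positive semidefinite covariance matrices, with squared form tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}). The sketch below only evaluates this distance on illustrative covariances; the deep factorization model itself is not shown.

```python
import numpy as np
from scipy.linalg import sqrtm

def bures_wasserstein_sq(A, B):
    """Squared Bures-Wasserstein distance between PSD matrices A and B:
    tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    rootA = sqrtm(A)
    cross = sqrtm(rootA @ B @ rootA)
    return np.trace(A) + np.trace(B) - 2.0 * np.real(np.trace(cross))

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)); A = M @ M.T        # illustrative covariance matrices
N = rng.standard_normal((4, 4)); B = N @ N.T
print(bures_wasserstein_sq(A, B))
```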

Training linear neural networks: Non-local convergence and complexity results

A Eftekhari - International Conference on Machine Learning, 2020 - proceedings.mlr.press
Linear networks provide valuable insights into the workings of neural networks in general.
This paper identifies conditions under which the gradient flow provably trains a linear …
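
A minimal numerical sketch of the gradient flow in question (not the paper's analysis; sizes, data, and initialization scale are illustrative assumptions): for a two-layer linear network with squared loss, the flow is the ODE dW_j/dt = -∂L/∂W_j, integrated here with a standard ODE solver.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(4)
d, k, n = 4, 3, 50
X = rng.standard_normal((d, n))     # inputs
Y = rng.standard_normal((d, n))     # targets

def loss_and_grads(W1, W2):
    R = W2 @ W1 @ X - Y             # residual of the two-layer linear network
    g1 = W2.T @ R @ X.T             # dL/dW1
    g2 = R @ (W1 @ X).T             # dL/dW2
    return 0.5 * np.sum(R ** 2), g1, g2

def flow(_, w):
    W1 = w[:k * d].reshape(k, d)
    W2 = w[k * d:].reshape(d, k)
    _, g1, g2 = loss_and_grads(W1, W2)
    return -np.concatenate([g1.ravel(), g2.ravel()])   # gradient flow: dW/dt = -grad L

w0 = 0.1 * rng.standard_normal(k * d + d * k)           # small illustrative initialization
sol = solve_ivp(flow, (0.0, 5.0), w0, rtol=1e-6)
W1 = sol.y[:k * d, -1].reshape(k, d); W2 = sol.y[k * d:, -1].reshape(d, k)
print("final loss:", loss_and_grads(W1, W2)[0])
```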

Convergence of gradient descent for learning linear neural networks

GM Nguegnang, H Rauhut, U Terstiege - Advances in Continuous and …, 2024 - Springer
We study the convergence properties of gradient descent for training deep linear neural
networks, i.e., deep matrix factorizations, by extending a previous analysis for the related …
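
A minimal sketch of the discrete-time counterpart, gradient descent on a three-factor matrix factorization fitting a target matrix (not the paper's setting in detail; dimensions, step size, and initialization are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 4
target = rng.standard_normal((d, d))
Ws = [np.eye(d) + 0.05 * rng.standard_normal((d, d)) for _ in range(3)]   # W1, W2, W3

lr = 0.02
for _ in range(2000):
    prod = Ws[2] @ Ws[1] @ Ws[0]
    R = prod - target                      # residual of the end-to-end matrix
    grads = [
        (Ws[2] @ Ws[1]).T @ R,             # dL/dW1
        Ws[2].T @ R @ Ws[0].T,             # dL/dW2
        R @ (Ws[1] @ Ws[0]).T,             # dL/dW3
    ]
    Ws = [W - lr * G for W, G in zip(Ws, grads)]

print("final loss:", 0.5 * np.linalg.norm(Ws[2] @ Ws[1] @ Ws[0] - target) ** 2)
```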

Geometry of linear convolutional networks

K Kohn, T Merkh, G Montúfar, M Trager - SIAM Journal on Applied Algebra and …, 2022 - SIAM
We study the family of functions that are represented by a linear convolutional network
(LCN). These functions form a semi-algebraic subset of the set of linear maps from input …
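
A minimal one-dimensional sketch of the structure behind this family (layer count and filter sizes are illustrative assumptions): composing convolutional layers with no nonlinearity convolves the filters themselves, so the end-to-end linear map is convolution with the product of the filter polynomials.

```python
import numpy as np

rng = np.random.default_rng(6)
filters = [rng.standard_normal(3) for _ in range(3)]        # three layers, filter size 3 each

def lcn(x, filters):
    for w in filters:
        x = np.convolve(x, w)                                # full convolution, layer by layer
    return x

end_to_end = filters[0]
for w in filters[1:]:
    end_to_end = np.convolve(end_to_end, w)                  # product of the filter polynomials

x = rng.standard_normal(10)
print(np.allclose(lcn(x, filters), np.convolve(x, end_to_end)))   # True
```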

Functional dimension of feedforward ReLU neural networks

JE Grigsby, K Lindsey, R Meyerhoff, C Wu - arXiv preprint arXiv …, 2022 - arxiv.org
It is well-known that the parameterized family of functions representable by fully-connected
feedforward neural networks with ReLU activation function is precisely the class of …
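
A hedged numerical sketch in the spirit of the title, not the paper's method: probe the parameter-to-function map of a small ReLU network by evaluating it on a batch of sample inputs and taking the rank of the Jacobian of those outputs with respect to the parameters. The architecture, sample count, and tolerances below are illustrative assumptions.

```python
import numpy as np

def relu_net(params, x, widths):
    """Fully-connected ReLU network; params is a flat vector of all weights and biases."""
    h, idx = x, 0
    for i in range(len(widths) - 1):
        din, dout = widths[i], widths[i + 1]
        W = params[idx: idx + din * dout].reshape(dout, din); idx += din * dout
        b = params[idx: idx + dout]; idx += dout
        h = W @ h + b
        if i < len(widths) - 2:
            h = np.maximum(h, 0.0)        # ReLU on hidden layers only
    return h

widths = [2, 3, 3, 1]                     # illustrative architecture
n_params = sum(widths[i] * widths[i + 1] + widths[i + 1] for i in range(len(widths) - 1))
rng = np.random.default_rng(7)
theta = rng.standard_normal(n_params)
xs = rng.standard_normal((40, 2))         # sample inputs

def batch_out(p):
    return np.concatenate([relu_net(p, x, widths) for x in xs])

# Central-difference Jacobian of the batched parameter-to-output map at theta.
eps = 1e-6
J = np.stack([(batch_out(theta + eps * e) - batch_out(theta - eps * e)) / (2 * eps)
              for e in np.eye(n_params)], axis=1)
print("rank of batched parameter Jacobian:", np.linalg.matrix_rank(J, tol=1e-4))
```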

The geometry of memoryless stochastic policy optimization in infinite-horizon POMDPs

J Müller, G Montúfar - arXiv preprint arXiv:2110.07409, 2021 - arxiv.org
We consider the problem of finding the best memoryless stochastic policy for an infinite-
horizon partially observable Markov decision process (POMDP) with finite state and action …
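
A small illustrative sketch of the object being optimized, not the paper's geometric analysis (all POMDP data below are assumed for illustration): a memoryless stochastic policy is a row-stochastic matrix from observations to actions; together with the observation kernel it induces a state-level policy, whose discounted return can be evaluated by a linear solve of the Bellman equation.

```python
import numpy as np

rng = np.random.default_rng(8)
nS, nO, nA, gamma = 4, 2, 3, 0.9

# Illustrative POMDP data: transitions P[a, s, s'], rewards r[s, a],
# observation kernel Obs[s, o], and an initial state distribution mu.
P = rng.random((nA, nS, nS)); P /= P.sum(axis=2, keepdims=True)
r = rng.random((nS, nA))
Obs = rng.random((nS, nO)); Obs /= Obs.sum(axis=1, keepdims=True)
mu = np.ones(nS) / nS

def discounted_return(pi):
    """pi[o, a] is a memoryless stochastic policy (each row a distribution over actions).
    The induced state-level policy is tau[s, a] = sum_o Obs[s, o] * pi[o, a]."""
    tau = Obs @ pi                                          # (nS, nA)
    P_pi = np.einsum('sa,ast->st', tau, P)                  # policy-averaged transition matrix
    r_pi = np.sum(tau * r, axis=1)                          # expected one-step reward per state
    v = np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)    # Bellman equation: v = r_pi + gamma P_pi v
    return mu @ v

pi = rng.random((nO, nA)); pi /= pi.sum(axis=1, keepdims=True)
print(discounted_return(pi))
```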