On the expressive power of deep polynomial neural networks

J Kileel, M Trager, J Bruna - Advances in neural information …, 2019 - proceedings.neurips.cc
We study deep neural networks with polynomial activations, particularly their expressive
power. For a fixed architecture and activation degree, a polynomial neural network defines …
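
As a minimal illustration of the object in the title (not taken from the paper; the architecture, widths, and degree below are illustrative assumptions), a polynomial neural network alternates linear layers with a coordinate-wise monomial activation, so each output coordinate is a polynomial in the inputs and in the weights:

```python
import numpy as np

def poly_net(x, weights, degree=2):
    """Evaluate a polynomial network: alternate linear layers with
    the coordinate-wise power activation t -> t**degree."""
    h = x
    for W in weights[:-1]:
        h = (W @ h) ** degree   # linear map followed by monomial activation
    return weights[-1] @ h      # final layer is linear (no activation)

rng = np.random.default_rng(0)
widths = [3, 4, 4, 2]           # illustrative architecture d0 -> d1 -> d2 -> d3
weights = [rng.standard_normal((widths[i + 1], widths[i])) for i in range(3)]

x = rng.standard_normal(3)
print(poly_net(x, weights))     # each output coordinate is a polynomial in x (and in the weights)
```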

Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers

B Bah, H Rauhut, U Terstiege… - … and Inference: A …, 2022 - academic.oup.com
We study the convergence of gradient flows related to learning deep linear neural networks
(where the activation function is the identity map) from data. In this case, the composition of …
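
A minimal sketch of the setting described in the snippet (layer sizes and data are illustrative assumptions): with the identity activation, the network collapses to the product of its layer matrices, so the loss depends on the parameters only through that end-to-end matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.standard_normal((5, 5)) for _ in range(4)]   # W_1, ..., W_N (illustrative sizes)

def end_to_end(layers):
    """Identity activation: the network computes x -> W_N ... W_1 x."""
    W = np.eye(layers[0].shape[1])
    for Wj in layers:
        W = Wj @ W
    return W

X = rng.standard_normal((5, 100))                 # inputs
Y = rng.standard_normal((5, 100))                 # targets
W = end_to_end(layers)
loss = 0.5 * np.linalg.norm(W @ X - Y) ** 2       # squared loss depends on the product only
print(loss)
```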

The effect of smooth parametrizations on nonconvex optimization landscapes

E Levin, J Kileel, N Boumal - Mathematical Programming, 2024 - Springer
We develop new tools to study landscapes in nonconvex optimization. Given one
optimization problem, we pair it with another by smoothly parametrizing the domain. This is …
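
A hedged sketch of the general setup: given a cost f on one domain, a smooth parametrization phi of that domain yields a paired "lifted" problem with cost g = f ∘ phi. The example below (a rank-constrained PSD parametrization X = L Lᵀ) is a standard instance of such a pairing, chosen for illustration and not necessarily one of the paper's examples.

```python
import numpy as np

def f(X):
    # an illustrative smooth cost on symmetric matrices ("downstairs" problem)
    A = np.diag([3.0, 1.0, 0.0])
    return 0.5 * np.linalg.norm(X - A) ** 2

def phi(L):
    # smooth parametrization of PSD matrices of rank <= 2 by factor matrices L
    return L @ L.T

def g(L):
    # the pulled-back ("upstairs") cost whose landscape is paired with f's
    return f(phi(L))

L = np.random.default_rng(2).standard_normal((3, 2))
print(g(L))
```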

[Book] Metric algebraic geometry

P Breiding, K Kohn, B Sturmfels - 2024 - library.oapen.org
Metric algebraic geometry combines concepts from algebraic geometry and differential
geometry. Building on classical foundations, it offers practical tools for the 21st century …

Critical points and convergence analysis of generative deep linear networks trained with Bures-Wasserstein loss

P Bréchet, K Papagiannouli, J An… - … on Machine Learning, 2023 - proceedings.mlr.press
We consider a deep matrix factorization model of covariance matrices trained with the Bures-
Wasserstein distance. While recent works have made advances in the study of the …
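
For reference, the loss named in the title is the Bures-Wasserstein distance between positive semidefinite covariance matrices, with squared form tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}). The sketch below only evaluates this distance on illustrative covariances; the deep factorization model itself is not shown.

```python
import numpy as np
from scipy.linalg import sqrtm

def bures_wasserstein_sq(A, B):
    """Squared Bures-Wasserstein distance between PSD matrices A and B:
    tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    rootA = sqrtm(A)
    cross = sqrtm(rootA @ B @ rootA)
    return np.trace(A) + np.trace(B) - 2.0 * np.real(np.trace(cross))

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)); A = M @ M.T        # illustrative covariance matrices
N = rng.standard_normal((4, 4)); B = N @ N.T
print(bures_wasserstein_sq(A, B))
```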

Training linear neural networks: Non-local convergence and complexity results

A Eftekhari - International Conference on Machine Learning, 2020 - proceedings.mlr.press
Linear networks provide valuable insights into the workings of neural networks in general.
This paper identifies conditions under which the gradient flow provably trains a linear …
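
A minimal numerical sketch of the gradient flow in question (not the paper's analysis; sizes, data, and initialization scale are illustrative assumptions): for a two-layer linear network with squared loss, the flow is the ODE dW_j/dt = -∂L/∂W_j, integrated here with a standard ODE solver.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(4)
d, k, n = 4, 3, 50
X = rng.standard_normal((d, n))     # inputs
Y = rng.standard_normal((d, n))     # targets

def loss_and_grads(W1, W2):
    R = W2 @ W1 @ X - Y             # residual of the two-layer linear network
    g1 = W2.T @ R @ X.T             # dL/dW1
    g2 = R @ (W1 @ X).T             # dL/dW2
    return 0.5 * np.sum(R ** 2), g1, g2

def flow(_, w):
    W1 = w[:k * d].reshape(k, d)
    W2 = w[k * d:].reshape(d, k)
    _, g1, g2 = loss_and_grads(W1, W2)
    return -np.concatenate([g1.ravel(), g2.ravel()])   # gradient flow: dW/dt = -grad L

w0 = 0.1 * rng.standard_normal(k * d + d * k)           # small illustrative initialization
sol = solve_ivp(flow, (0.0, 5.0), w0, rtol=1e-6)
W1 = sol.y[:k * d, -1].reshape(k, d); W2 = sol.y[k * d:, -1].reshape(d, k)
print("final loss:", loss_and_grads(W1, W2)[0])
```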

Convergence of gradient descent for learning linear neural networks

GM Nguegnang, H Rauhut, U Terstiege - Advances in Continuous and …, 2024 - Springer
We study the convergence properties of gradient descent for training deep linear neural
networks, i.e., deep matrix factorizations, by extending a previous analysis for the related …
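
A minimal sketch of the discrete-time counterpart, gradient descent on a three-factor matrix factorization fitting a target matrix (not the paper's setting in detail; dimensions, step size, and initialization are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 4
target = rng.standard_normal((d, d))
Ws = [np.eye(d) + 0.05 * rng.standard_normal((d, d)) for _ in range(3)]   # W1, W2, W3

lr = 0.02
for _ in range(2000):
    prod = Ws[2] @ Ws[1] @ Ws[0]
    R = prod - target                      # residual of the end-to-end matrix
    grads = [
        (Ws[2] @ Ws[1]).T @ R,             # dL/dW1
        Ws[2].T @ R @ Ws[0].T,             # dL/dW2
        R @ (Ws[1] @ Ws[0]).T,             # dL/dW3
    ]
    Ws = [W - lr * G for W, G in zip(Ws, grads)]

print("final loss:", 0.5 * np.linalg.norm(Ws[2] @ Ws[1] @ Ws[0] - target) ** 2)
```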

Geometry of linear convolutional networks

K Kohn, T Merkh, G Montúfar, M Trager - SIAM Journal on Applied Algebra and …, 2022 - SIAM
We study the family of functions that are represented by a linear convolutional network
(LCN). These functions form a semi-algebraic subset of the set of linear maps from input …
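
A minimal one-dimensional sketch of the structure behind this family (layer count and filter sizes are illustrative assumptions): composing convolutional layers with no nonlinearity convolves the filters themselves, so the end-to-end linear map is convolution with the product of the filter polynomials.

```python
import numpy as np

rng = np.random.default_rng(6)
filters = [rng.standard_normal(3) for _ in range(3)]        # three layers, filter size 3 each

def lcn(x, filters):
    for w in filters:
        x = np.convolve(x, w)                                # full convolution, layer by layer
    return x

end_to_end = filters[0]
for w in filters[1:]:
    end_to_end = np.convolve(end_to_end, w)                  # product of the filter polynomials

x = rng.standard_normal(10)
print(np.allclose(lcn(x, filters), np.convolve(x, end_to_end)))   # True
```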

Functional dimension of feedforward ReLU neural networks

JE Grigsby, K Lindsey, R Meyerhoff, C Wu - arXiv preprint arXiv …, 2022 - arxiv.org
It is well-known that the parameterized family of functions representable by fully-connected
feedforward neural networks with ReLU activation function is precisely the class of …
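
A hedged numerical sketch in the spirit of the title, not the paper's method: probe the parameter-to-function map of a small ReLU network by evaluating it on a batch of sample inputs and taking the rank of the Jacobian of those outputs with respect to the parameters. The architecture, sample count, and tolerances below are illustrative assumptions.

```python
import numpy as np

def relu_net(params, x, widths):
    """Fully-connected ReLU network; params is a flat vector of all weights and biases."""
    h, idx = x, 0
    for i in range(len(widths) - 1):
        din, dout = widths[i], widths[i + 1]
        W = params[idx: idx + din * dout].reshape(dout, din); idx += din * dout
        b = params[idx: idx + dout]; idx += dout
        h = W @ h + b
        if i < len(widths) - 2:
            h = np.maximum(h, 0.0)        # ReLU on hidden layers only
    return h

widths = [2, 3, 3, 1]                     # illustrative architecture
n_params = sum(widths[i] * widths[i + 1] + widths[i + 1] for i in range(len(widths) - 1))
rng = np.random.default_rng(7)
theta = rng.standard_normal(n_params)
xs = rng.standard_normal((40, 2))         # sample inputs

def batch_out(p):
    return np.concatenate([relu_net(p, x, widths) for x in xs])

# Central-difference Jacobian of the batched parameter-to-output map at theta.
eps = 1e-6
J = np.stack([(batch_out(theta + eps * e) - batch_out(theta - eps * e)) / (2 * eps)
              for e in np.eye(n_params)], axis=1)
print("rank of batched parameter Jacobian:", np.linalg.matrix_rank(J, tol=1e-4))
```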

The geometry of memoryless stochastic policy optimization in infinite-horizon POMDPs

J Müller, G Montúfar - arXiv preprint arXiv:2110.07409, 2021 - arxiv.org
We consider the problem of finding the best memoryless stochastic policy for an infinite-
horizon partially observable Markov decision process (POMDP) with finite state and action …
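
A small illustrative sketch of the object being optimized, not the paper's geometric analysis (all POMDP data below are assumed for illustration): a memoryless stochastic policy is a row-stochastic matrix from observations to actions; together with the observation kernel it induces a state-level policy, whose discounted return can be evaluated by a linear solve of the Bellman equation.

```python
import numpy as np

rng = np.random.default_rng(8)
nS, nO, nA, gamma = 4, 2, 3, 0.9

# Illustrative POMDP data: transitions P[a, s, s'], rewards r[s, a],
# observation kernel Obs[s, o], and an initial state distribution mu.
P = rng.random((nA, nS, nS)); P /= P.sum(axis=2, keepdims=True)
r = rng.random((nS, nA))
Obs = rng.random((nS, nO)); Obs /= Obs.sum(axis=1, keepdims=True)
mu = np.ones(nS) / nS

def discounted_return(pi):
    """pi[o, a] is a memoryless stochastic policy (each row a distribution over actions).
    The induced state-level policy is tau[s, a] = sum_o Obs[s, o] * pi[o, a]."""
    tau = Obs @ pi                                          # (nS, nA)
    P_pi = np.einsum('sa,ast->st', tau, P)                  # policy-averaged transition matrix
    r_pi = np.sum(tau * r, axis=1)                          # expected one-step reward per state
    v = np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)    # Bellman equation: v = r_pi + gamma P_pi v
    return mu @ v

pi = rng.random((nO, nA)); pi /= pi.sum(axis=1, keepdims=True)
print(discounted_return(pi))
```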