Implicit regularization in hierarchical tensor factorization and deep convolutional neural networks
In the pursuit of explaining implicit regularization in deep learning, prominent focus was
given to matrix and tensor factorizations, which correspond to simplified neural networks. It …
The effect of smooth parametrizations on nonconvex optimization landscapes
We develop new tools to study landscapes in nonconvex optimization. Given one
optimization problem, we pair it with another by smoothly parametrizing the domain. This is …
[BOOK][B] Metric algebraic geometry
P Breiding, K Kohn, B Sturmfels - 2024 - library.oapen.org
Metric algebraic geometry combines concepts from algebraic geometry and differential
geometry. Building on classical foundations, it offers practical tools for the 21st century …
Critical points and convergence analysis of generative deep linear networks trained with Bures-Wasserstein loss
We consider a deep matrix factorization model of covariance matrices trained with the Bures-
Wasserstein distance. While recent works have made advances in the study of the …
On the minimal algebraic complexity of the rank-one approximation problem for general inner products
K Kozhasov, A Muniz, Y Qi, L Sodomaco - arXiv preprint arXiv:2309.15105, 2023 - arxiv.org
We study the algebraic complexity of Euclidean distance minimization from a generic tensor
to a variety of rank-one tensors. The Euclidean Distance (ED) degree of the Segre-Veronese …
Understanding deep learning via notions of rank
N Razin - arXiv preprint arXiv:2408.02111, 2024 - arxiv.org
Despite the extreme popularity of deep learning in science and industry, its formal
understanding is limited. This thesis puts forth notions of rank as key for developing a theory …
Function space and critical points of linear convolutional networks
We study the geometry of linear networks with one-dimensional convolutional layers. The
function spaces of these networks can be identified with semialgebraic families of …
Side effects of learning from low-dimensional data embedded in a Euclidean space
The low-dimensional manifold hypothesis posits that the data found in many applications,
such as those involving natural images, lie (approximately) on low-dimensional manifolds …
The geometry of the deep linear network
G Menon - arXiv preprint arXiv:2411.09004, 2024 - arxiv.org
This article provides an expository account of training dynamics in the Deep Linear Network
(DLN) from the perspective of the geometric theory of dynamical systems. Rigorous results …
Geometry of lightning self-attention: Identifiability and dimension
NW Henry, GL Marchetti, K Kohn - arXiv preprint arXiv:2408.17221, 2024 - arxiv.org
We consider function spaces defined by self-attention networks without normalization, and
theoretically analyze their geometry. Since these networks are polynomial, we rely on tools …