The power of preconditioning in overparameterized low-rank matrix sensing
We propose $\textsf{ScaledGD}(\lambda)$, a preconditioned gradient descent
method to tackle the low-rank matrix sensing problem when the true rank is unknown, and …
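To make the preconditioning idea concrete, below is a minimal numerical sketch of a damped-preconditioned gradient step for overparameterized matrix sensing. The quadratic loss, the symmetric factorization $XX^\top$, the symmetric sensing matrices, and the step size are illustrative assumptions and are not claimed to match the paper's exact algorithm or tuning.

import numpy as np

def scaled_gd_lambda_step(X, A_list, y, eta=0.5, lam=1e-3):
    """One damped-preconditioned gradient step for f(X) = (1/(4m)) * sum_i (<A_i, X X^T> - y_i)^2.

    X      : (n, k) factor; k may exceed the true rank (overparameterization).
    A_list : list of m symmetric (n, n) sensing matrices (illustrative assumption).
    y      : (m,) vector of measurements.
    lam    : damping added to the preconditioner, playing the role of the lambda above.
    """
    m = len(A_list)
    M = X @ X.T
    residuals = [float(np.sum(A * M)) - yi for A, yi in zip(A_list, y)]
    # Gradient of the quadratic sensing loss: (1/m) * sum_i r_i * A_i @ X for symmetric A_i.
    G = sum(r * (A @ X) for r, A in zip(residuals, A_list)) / m
    # Damped preconditioner (X^T X + lam I)^{-1} rescales the search direction.
    P = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1]))
    return X - eta * G @ P

Without the damping term, the preconditioner is undefined whenever the overparameterized factor is rank deficient (as it is when $k$ exceeds the true rank and the columns are not full rank); the lam term keeps the inverse well defined.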
Over-parameterization exponentially slows down gradient descent for learning a single neuron
We revisit the canonical problem of learning a single neuron with ReLU activation under
Gaussian input with square loss. We particularly focus on the over-parameterization setting …
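One standard way to formalize this setting (our reading of the truncated snippet; the paper's exact student architecture and scaling may differ) is to fit a sum of $k$ ReLU neurons to a single teacher neuron under the population square loss:
$$
\min_{w_1,\dots,w_k \in \mathbb{R}^d} \; \mathbb{E}_{x \sim \mathcal{N}(0, I_d)}\Big[\Big(\sum_{j=1}^{k} \sigma(w_j^\top x) - \sigma(v^\top x)\Big)^{2}\Big], \qquad \sigma(z) = \max(z, 0),
$$
where $v$ is the unknown teacher direction; $k = 1$ is the exactly parameterized case, and $k > 1$ is the over-parameterized regime the snippet refers to.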
Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with ℓ1-loss, where the …
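As a hedged illustration of what such a formulation can look like in code, here is a small subgradient-method loop on an ℓ1 matrix-sensing loss with a symmetric factorization, small random initialization, and a geometrically decaying step size. These choices echo the abstract's keywords but are our assumptions, not the paper's exact algorithm.

import numpy as np

def subgm_l1_recovery(A_list, y, n, k, T=500, eta0=0.1, decay=0.99, init_scale=1e-3, seed=0):
    """Subgradient method on f(X) = (1/m) * sum_i |<A_i, X X^T> - y_i| with symmetric A_i.

    k may exceed the true rank (over-parameterization); init_scale controls the small
    random initialization; the step size decays geometrically. All are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    X = init_scale * rng.standard_normal((n, k))
    m = len(A_list)
    for t in range(T):
        M = X @ X.T
        residuals = [float(np.sum(A * M)) - yi for A, yi in zip(A_list, y)]
        # Subgradient: (1/m) * sum_i sign(r_i) * 2 A_i X, using d<A, X X^T>/dX = 2 A X for symmetric A.
        G = 2.0 * sum(np.sign(r) * (A @ X) for r, A in zip(residuals, A_list)) / m
        X = X - eta0 * (decay ** t) * G
    return X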
Improved global guarantees for the nonconvex Burer–Monteiro factorization via rank overparameterization
RY Zhang - Mathematical Programming, 2024 - Springer
We consider minimizing a twice-differentiable, $L$-smooth, and $\mu$-strongly convex
objective $\phi$ over an $n\times n$ positive semidefinite matrix $M\succeq 0$, under the …
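Written out, the factorization in the title replaces the PSD-constrained variable with an explicit low-rank factor, and "rank overparameterization" means giving the factor more columns than the rank of the true minimizer (the statement below is a generic form of the recipe, not the paper's specific rank threshold):
$$
\min_{M \succeq 0} \; \phi(M) \quad\longrightarrow\quad \min_{X \in \mathbb{R}^{n \times k}} \; \phi(XX^\top), \qquad k \ge \operatorname{rank}(M^\star),
$$
where $M^\star$ is the minimizer of $\phi$ over the positive semidefinite cone; the factored problem is nonconvex in $X$ even though $\phi$ is strongly convex in $M$.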
How over-parameterization slows down gradient descent in matrix sensing: The curses of symmetry and initialization
This paper rigorously shows how over-parameterization changes the convergence
behaviors of gradient descent (GD) for the matrix sensing problem, where the goal is to …
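For concreteness, a standard formalization of gradient descent for over-parameterized matrix sensing (the symmetric factorization and quadratic loss are our assumptions; the paper may analyze a different variant) is
$$
\min_{X \in \mathbb{R}^{n \times k}} \; f(X) = \frac{1}{4m}\sum_{i=1}^{m}\big(\langle A_i, XX^\top\rangle - y_i\big)^{2}, \qquad X_{t+1} = X_t - \eta\,\nabla f(X_t),
$$
where the ground-truth matrix has rank $r$ and over-parameterization means running GD with $k > r$ columns in the factor.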
Simpler Gradient Methods for Blind Super-Resolution with Lower Iteration Complexity
J Li, W Cui, X Zhang - IEEE Transactions on Signal Processing, 2024 - ieeexplore.ieee.org
We study the problem of blind super-resolution, which can be formulated as a low-rank
matrix recovery problem via vectorized Hankel lift (VHL). The previous gradient descent …
Gradient descent with adaptive stepsize converges (nearly) linearly under fourth-order growth
A prevalent belief among optimization specialists is that linear convergence of gradient
descent is contingent on the function growing quadratically away from its minimizers. In this …
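The growth conditions at issue can be stated explicitly (in a standard form; the paper's precise assumptions and constants may differ):
$$
f(x) - \min f \;\ge\; \mu\,\operatorname{dist}(x, \mathcal{X}^\star)^{2} \;\;\text{(quadratic growth)}, \qquad f(x) - \min f \;\ge\; \mu\,\operatorname{dist}(x, \mathcal{X}^\star)^{4} \;\;\text{(fourth-order growth)},
$$
for all $x$ in a neighborhood of the solution set $\mathcal{X}^\star$; the title's claim is that an adaptive stepsize restores (nearly) linear convergence even under the weaker fourth-order condition.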
Fast and Provable Simultaneous Blind Super-Resolution and Demixing for Point Source Signals: Scaled Gradient Descent without Regularization
J Chen - arXiv preprint arXiv:2407.09900, 2024 - arxiv.org
We address the problem of simultaneously recovering a sequence of point source signals
from observations limited to the low-frequency end of the spectrum of their summed …
Can Learning Be Explained By Local Optimality In Low-rank Matrix Recovery?
We explore the local landscape of low-rank matrix recovery, aiming to reconstruct a
$d_1\times d_2$ matrix with rank $r$ from $m$ linear measurements, some potentially …
Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks
We study the convergence rate of first-order methods for rectangular matrix factorization,
which is a canonical nonconvex optimization problem. Specifically, given a rank-$r$ matrix …
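As a hedged sketch, here is Nesterov's accelerated gradient applied to the factored least-squares objective for rectangular matrix factorization; the loss $\tfrac12\|XY^\top - A\|_F^2$, the constant momentum parameter, and the initialization scale are illustrative assumptions rather than the exact scheme analyzed in the paper.

import numpy as np

def nag_matrix_factorization(A, k, T=500, eta=1e-2, beta=0.9, seed=0):
    """Nesterov's accelerated gradient on f(X, Y) = 0.5 * ||X Y^T - A||_F^2.

    A    : (d1, d2) target matrix, approximately rank r; k may exceed r.
    beta : constant momentum parameter (illustrative; schedules vary in practice).
    """
    rng = np.random.default_rng(seed)
    d1, d2 = A.shape
    X = 1e-2 * rng.standard_normal((d1, k))
    Y = 1e-2 * rng.standard_normal((d2, k))
    X_prev, Y_prev = X.copy(), Y.copy()  # previous iterates for the momentum term
    for _ in range(T):
        # Extrapolated (look-ahead) point, as in Nesterov's scheme.
        Xe = X + beta * (X - X_prev)
        Ye = Y + beta * (Y - Y_prev)
        R = Xe @ Ye.T - A                     # residual at the look-ahead point
        GX, GY = R @ Ye, R.T @ Xe             # gradients of f with respect to X and Y
        X_prev, Y_prev = X, Y
        X, Y = Xe - eta * GX, Ye - eta * GY   # gradient step taken from the look-ahead point
    return X, Y

Setting beta = 0 recovers plain gradient descent on the same factored objective.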