The power of preconditioning in overparameterized low-rank matrix sensing

X Xu, Y Shen, Y Chi, C Ma - International Conference on …, 2023 - proceedings.mlr.press
We propose $\textsf{ScaledGD}(\lambda)$, a preconditioned gradient descent
method to tackle the low-rank matrix sensing problem when the true rank is unknown, and …
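
The core of the method is a damped right preconditioner applied to the gradient. Below is a minimal NumPy sketch of a ScaledGD(λ)-style update in the symmetric setting, assuming the sensing operator is given as a list of symmetric matrices; the stepsize eta, damping lam, and initialization scale are illustrative placeholders, not the paper's tuned choices.

    import numpy as np

    def scaled_gd_lambda(A, y, n, k, lam=1e-3, eta=0.5, iters=200, seed=0):
        """Preconditioned GD for min_X || A(X X^T) - y ||^2 (symmetric sensing).

        A: list of m symmetric n-by-n sensing matrices; y: m measurements.
        k is the over-specified rank (the true rank is unknown)."""
        rng = np.random.default_rng(seed)
        X = 1e-3 * rng.standard_normal((n, k))                     # small random init
        for _ in range(iters):
            M = X @ X.T
            r = np.array([np.sum(Ai * M) for Ai in A]) - y         # residuals <A_i, M> - y_i
            G = sum(ri * Ai for ri, Ai in zip(r, A)) @ X / len(A)  # gradient w.r.t. X (up to constant)
            P = np.linalg.inv(X.T @ X + lam * np.eye(k))           # damped preconditioner
            X = X - eta * G @ P                                    # ScaledGD(lambda)-style step
        return X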

Over-parameterization exponentially slows down gradient descent for learning a single neuron

W Xu, S Du - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
We revisit the canonical problem of learning a single neuron with ReLU activation under
Gaussian input with square loss. We particularly focus on the over-parameterization setting …
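
To make the setting concrete: an over-parameterized student with k ≥ 2 ReLU neurons is trained by gradient descent on the square loss to fit a single teacher neuron under Gaussian inputs. A hedged sketch of one reading of that setup follows; the paper's exact initialization and stepsize schedule differ.

    import numpy as np

    def gd_single_neuron(d=10, k=4, n=2048, eta=0.05, iters=500, seed=0):
        """GD on the square loss for a student sum_j relu(w_j . x)
        fitting a single ReLU teacher relu(v . x), Gaussian inputs."""
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(d); v /= np.linalg.norm(v)    # teacher neuron
        X = rng.standard_normal((n, d))                       # Gaussian inputs
        y = np.maximum(X @ v, 0.0)                            # teacher labels
        W = 0.1 * rng.standard_normal((k, d))                 # over-parameterized student
        for _ in range(iters):
            Z = X @ W.T                                       # pre-activations, n x k
            res = np.maximum(Z, 0.0).sum(axis=1) - y          # residuals
            grad = ((res[:, None] * (Z > 0)).T @ X) / n       # dL/dW, k x d
            W -= eta * grad
        return W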

Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization

J Ma, S Fattahi - Journal of Machine Learning Research, 2023 - jmlr.org
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with ℓ1-loss, where the …
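
The formulation in question is a subgradient method on the nonsmooth ℓ1 objective over a factorized variable. A minimal sketch under my assumptions (symmetric sensing, a plain diminishing stepsize; the paper's stepsize rule and initialization scale are more delicate):

    import numpy as np

    def subgm_l1(A, y, n, k, alpha=1e-3, iters=500, seed=0):
        """Subgradient method for min_X (1/m) || A(X X^T) - y ||_1 with small init."""
        rng = np.random.default_rng(seed)
        X = 1e-4 * rng.standard_normal((n, k))                 # small initialization
        for t in range(iters):
            M = X @ X.T
            r = np.array([np.sum(Ai * M) for Ai in A]) - y
            s = np.sign(r)                                     # subgradient of the l1 loss
            G = sum(si * Ai for si, Ai in zip(s, A)) @ X / len(A)
            X = X - (alpha / np.sqrt(t + 1)) * G               # diminishing stepsize (illustrative)
        return X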

Improved global guarantees for the nonconvex Burer–Monteiro factorization via rank overparameterization

RY Zhang - Mathematical Programming, 2024 - Springer
We consider minimizing a twice-differentiable, $L$-smooth, and $\mu$-strongly convex
objective $\phi$ over an $n\times n$ positive semidefinite matrix $M \succeq 0$, under the …
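
The Burer–Monteiro idea: replace the PSD variable with the factorization $M = XX^{\mathsf T}$, $X \in \mathbb{R}^{n\times k}$, and run a local method in $X$; rank over-parameterization means choosing $k$ above the rank of the true minimizer. A minimal sketch with a toy strongly convex $\phi$ (the quadratic below is illustrative, not the paper's general objective):

    import numpy as np

    def burer_monteiro_gd(Mstar, k, eta=0.1, iters=500, seed=0):
        """GD on phi(X X^T) with phi(M) = 0.5 * ||M - Mstar||_F^2, a toy 1-smooth,
        1-strongly convex objective. Over-parameterized when k > rank(Mstar)."""
        n = Mstar.shape[0]
        rng = np.random.default_rng(seed)
        X = rng.standard_normal((n, k)) / np.sqrt(n)
        for _ in range(iters):
            gphi = X @ X.T - Mstar          # gradient of phi at M = X X^T
            X = X - eta * 2.0 * gphi @ X    # chain rule: grad_X = (gphi + gphi^T) X
        return X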

How over-parameterization slows down gradient descent in matrix sensing: The curses of symmetry and initialization

N Xiong, L Ding, SS Du - arXiv preprint arXiv:2310.01769, 2023 - arxiv.org
This paper rigorously shows how over-parameterization changes the convergence
behaviors of gradient descent (GD) for the matrix sensing problem, where the goal is to …
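
The slowdown the paper analyzes can be observed with a plain GD baseline on the over-parameterized symmetric factorization. A minimal sketch, assuming a generic list of symmetric sensing matrices and illustrative stepsize and initialization (not the paper's experimental setup); per the paper, with k above the true rank the relative residual decays sub-linearly rather than linearly.

    import numpy as np

    def vanilla_gd_sensing(A, y, n, k, eta=0.25, iters=1000, seed=0):
        """Plain GD on f(X) = ||A(X X^T) - y||^2 with over-specified rank k,
        tracking the relative residual along the way."""
        rng = np.random.default_rng(seed)
        X = 1e-2 * rng.standard_normal((n, k))
        errs = []
        for _ in range(iters):
            M = X @ X.T
            r = np.array([np.sum(Ai * M) for Ai in A]) - y
            G = sum(ri * Ai for ri, Ai in zip(r, A)) @ X / len(A)
            X = X - eta * G
            errs.append(np.linalg.norm(r) / max(np.linalg.norm(y), 1e-12))
        return X, errs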

Simpler Gradient Methods for Blind Super-Resolution with Lower Iteration Complexity

J Li, W Cui, X Zhang - IEEE Transactions on Signal Processing, 2024 - ieeexplore.ieee.org
We study the problem of blind super-resolution, which can be formulated as a low-rank
matrix recovery problem via the vectorized Hankel lift (VHL). The previous gradient descent …
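
For readers unfamiliar with the lift: my understanding is that the vectorized Hankel lift maps a data matrix X of shape s×n to a block-Hankel matrix whose (i, j)-th s×1 block is column i + j of X. Treat the construction below as an assumption about the paper's notation rather than a verbatim reproduction.

    import numpy as np

    def vectorized_hankel_lift(X, n1):
        """Block-Hankel lift of X (s x n): returns H of shape (s*n1, n2),
        n2 = n - n1 + 1, with (i, j)-th s-by-1 block equal to X[:, i + j]."""
        s, n = X.shape
        n2 = n - n1 + 1
        H = np.empty((s * n1, n2), dtype=X.dtype)
        for i in range(n1):
            for j in range(n2):
                H[i * s:(i + 1) * s, j] = X[:, i + j]
        return H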

Gradient descent with adaptive stepsize converges (nearly) linearly under fourth-order growth

D Davis, D Drusvyatskiy, L Jiang - arXiv preprint arXiv:2409.19791, 2024 - arxiv.org
A prevalent belief among optimization specialists is that linear convergence of gradient
descent is contingent on the function growing quadratically away from its minimizers. In this …
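
One classical adaptive rule of this flavor is the Polyak stepsize η_t = (f(x_t) − f*)/‖∇f(x_t)‖²; whether it coincides with the paper's exact scheme is my assumption, but it already exhibits the phenomenon on a quartic: for f(x) = ‖x‖⁴ the step works out to x ← (3/4)x, i.e. linear convergence despite only fourth-order growth away from the minimizer.

    import numpy as np

    def polyak_gd(x0, f, grad, fstar=0.0, iters=50):
        """GD with the Polyak stepsize (f(x) - f*) / ||grad(x)||^2."""
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            g = grad(x)
            gn2 = float(g @ g)
            if gn2 == 0.0:
                break
            x = x - ((f(x) - fstar) / gn2) * g
        return x

    # f(x) = ||x||^4 grows only quartically, yet each Polyak step contracts x by 3/4.
    f = lambda x: float(x @ x) ** 2
    grad = lambda x: 4.0 * float(x @ x) * x
    print(polyak_gd(np.ones(3), f, grad))   # geometrically close to 0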

Fast and Provable Simultaneous Blind Super-Resolution and Demixing for Point Source Signals: Scaled Gradient Descent without Regularization

J Chen - arXiv preprint arXiv:2407.09900, 2024 - arxiv.org
We address the problem of simultaneously recovering a sequence of point source signals
from observations limited to the low-frequency end of the spectrum of their summed …
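
The "scaled gradient descent without regularization" in the title refers, as I read it, to the preconditioned update of Tong, Ma, and Chi applied without a balancing regularizer; in an asymmetric factorization $M = L R^{\mathsf H}$ it takes the form

    L_{t+1} = L_t - \eta\, \nabla_L f(L_t, R_t)\,(R_t^{\mathsf H} R_t)^{-1},
    \qquad
    R_{t+1} = R_t - \eta\, \nabla_R f(L_t, R_t)\,(L_t^{\mathsf H} L_t)^{-1}.

Whether the paper's variant matches this form exactly is an assumption on my part.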

Can Learning Be Explained By Local Optimality In Low-rank Matrix Recovery?

J Ma, S Fattahi - arXiv preprint arXiv:2302.10963, 2023 - arxiv.org
We explore the local landscape of low-rank matrix recovery, aiming to reconstruct a
$d_1 \times d_2$ matrix with rank $r$ from $m$ linear measurements, some potentially …
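
One standard robust formulation consistent with this snippet (an assumption; the paper studies the local landscape of such objectives) is the factorized ℓ1 program

    \min_{U \in \mathbb{R}^{d_1 \times r'},\; V \in \mathbb{R}^{d_2 \times r'}}
    \; \frac{1}{m}\, \bigl\| \mathcal{A}(U V^{\mathsf T}) - y \bigr\|_1,
    \qquad
    \mathcal{A}(M) = \bigl(\langle A_i, M \rangle\bigr)_{i=1}^{m},

with $r' \ge r$ in the over-parameterized regime and the ℓ1 loss providing robustness to the corrupted measurements.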

Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks

Z Xu, Y Wang, T Zhao, R Ward, M Tao - arXiv preprint arXiv:2410.09640, 2024 - arxiv.org
We study the convergence rate of first-order methods for rectangular matrix factorization,
which is a canonical nonconvex optimization problem. Specifically, given a rank-$r$ matrix …
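
The problem is min over X, Y of ‖XY^T − M‖_F²/2 with rectangular factors X ∈ R^{m×r'}, Y ∈ R^{n×r'}. A minimal Nesterov-accelerated sketch is below; the momentum coefficient and stepsize are illustrative placeholders, not the paper's certified schedule.

    import numpy as np

    def nag_matrix_factorization(M, r, eta=0.01, beta=0.9, iters=1000, seed=0):
        """Nesterov-accelerated gradient for min_{X,Y} 0.5 * ||X Y^T - M||_F^2."""
        m, n = M.shape
        rng = np.random.default_rng(seed)
        X = rng.standard_normal((m, r)) / np.sqrt(m)
        Y = rng.standard_normal((n, r)) / np.sqrt(n)
        Xp, Yp = X.copy(), Y.copy()                            # previous iterates
        for _ in range(iters):
            Xe, Ye = X + beta * (X - Xp), Y + beta * (Y - Yp)  # look-ahead point
            R = Xe @ Ye.T - M                                  # residual at look-ahead
            Xp, Yp = X, Y
            X = Xe - eta * R @ Ye                              # grad_X = R Y
            Y = Ye - eta * R.T @ Xe                            # grad_Y = R^T X
        return X, Y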