Robust training under label noise by over-parameterization

S Liu, Z Zhu, Q Qu, C You - International Conference on …, 2022 - proceedings.mlr.press
Recently, over-parameterized deep networks, with far more network parameters
than training samples, have dominated the performance of modern machine learning …

Do we really need a new theory to understand over-parameterization?

L Oneto, S Ridella, D Anguita - Neurocomputing, 2023 - Elsevier
This century has seen an unprecedented increase in public and private investment in Artificial
Intelligence (AI), and especially in (Deep) Machine Learning (ML). This has led to breakthroughs …

Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization

J Ma, S Fattahi - Journal of Machine Learning Research, 2023 - jmlr.org
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with the ℓ1-loss, where the …
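
To make the snippet concrete, here is a minimal sketch of SubGM on the ℓ1-loss f(X) = (1/m) Σᵢ |⟨Aᵢ, XXᵀ⟩ − yᵢ|, with small random initialization and geometrically decaying step sizes. The function name, hyperparameters, and decay rule are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def subgm_l1_recovery(A, y, n, r, steps=500, alpha=0.1, q=0.99,
                      init_scale=1e-3, seed=0):
    """Sketch of the sub-gradient method (SubGM) on the l1-loss
    f(X) = (1/m) * sum_i |<A_i, X X^T> - y_i|.
    Small initialization and geometric step decay are illustrative."""
    rng = np.random.default_rng(seed)
    X = init_scale * rng.standard_normal((n, r))  # small random initialization
    m = len(y)
    for t in range(steps):
        # residuals of the linear measurements against the current X X^T
        res = np.array([np.sum(A[i] * (X @ X.T)) for i in range(m)]) - y
        # a subgradient of the l1-loss with respect to X
        G = sum(np.sign(res[i]) * (A[i] + A[i].T) @ X for i in range(m)) / m
        X -= alpha * (q ** t) * G  # geometrically decaying step size
    return X @ X.T
```

Note that r here may exceed the true rank of the target matrix, which is the over-parameterized regime the paper analyzes.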

Smoothing the edges: A general framework for smooth optimization in sparse regularization using Hadamard overparametrization

C Kolb, CL Müller, B Bischl… - arXiv preprint arXiv …, 2023 - researchgate.net
This paper presents a framework for smooth optimization of objectives with ℓq and ℓp,q
regularization for (structured) sparsity. Finding solutions to these non-smooth and possibly …
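
The core trick the title refers to can be sketched in a few lines: writing β = u ⊙ v turns the nonsmooth ℓ1 penalty into a smooth quadratic one, since λ‖β‖₁ = min over u ⊙ v = β of (λ/2)(‖u‖² + ‖v‖²). The sketch below applies plain gradient descent to the resulting smooth lasso surrogate; names, initialization, and step sizes are assumptions for illustration.

```python
import numpy as np

def hadamard_lasso(Xmat, y, lam=0.1, steps=2000, lr=1e-2, seed=0):
    """Smooth surrogate of the lasso via Hadamard overparametrization:
    beta = u * v, and lam*||beta||_1 = min_{u*v=beta} (lam/2)(||u||^2 + ||v||^2),
    so gradient descent on (u, v) needs no proximal or subgradient machinery."""
    rng = np.random.default_rng(seed)
    n, p = Xmat.shape
    u = 0.1 * rng.standard_normal(p)
    v = 0.1 * rng.standard_normal(p)
    for _ in range(steps):
        grad_beta = Xmat.T @ (Xmat @ (u * v) - y) / n  # smooth LS gradient in beta
        gu = grad_beta * v + lam * u                   # chain rule through beta = u * v
        gv = grad_beta * u + lam * v
        u -= lr * gu
        v -= lr * gv
    return u * v  # recovered sparse coefficient vector
```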

Preconditioned Gradient Descent for Overparameterized Nonconvex Burer-Monteiro Factorization with Global Optimality Certification

G Zhang, S Fattahi, RY Zhang - Journal of Machine Learning Research, 2023 - jmlr.org
We consider using gradient descent to minimize the nonconvex function f(X) = ϕ(XXᵀ) over
an n × r factor matrix X, in which ϕ is an underlying smooth convex cost function defined over …
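
As a concrete reading of the update this abstract describes, a right preconditioner built from XᵀX can offset the ill-conditioning that over-parameterization (r larger than the true rank) introduces. The step below is a hedged sketch; the fixed damping constant eps is a simplification of the paper's rule.

```python
import numpy as np

def precgd_step(X, grad_f, eta=0.05, eps=1e-6):
    """One preconditioned gradient step for f(X) = phi(X X^T):
    X <- X - eta * grad_f(X) @ inv(X^T X + eps * I).
    A fixed eps is used here for simplicity (an assumption)."""
    r = X.shape[1]
    P = np.linalg.inv(X.T @ X + eps * np.eye(r))  # right preconditioner
    return X - eta * grad_f(X) @ P

# Example: phi(M) = 0.5 * ||M - M_star||_F^2 gives grad_f(X) = 2 * (X @ X.T - M_star) @ X.
```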

Behind the scenes of gradient descent: A trajectory analysis via basis function decomposition

J Ma, L Guo, S Fattahi - arXiv preprint arXiv:2210.00346, 2022 - arxiv.org
This work analyzes the solution trajectory of gradient-based algorithms via a novel basis
function decomposition. We show that, although solution trajectories of gradient-based …

Mitigating label noise through data ambiguation

J Lienen, E Hüllermeier - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Label noise poses an important challenge in machine learning, especially in deep learning,
in which large models with high expressive power dominate the field. Models of that kind are …

Efficient compression of overparameterized deep models through low-dimensional learning dynamics

SM Kwon, Z Zhang, D Song, L Balzano… - arXiv preprint arXiv …, 2023 - arxiv.org
Overparameterized models have proven to be powerful tools for solving various machine
learning tasks. However, overparameterization often leads to a substantial increase in …

On the Optimization Landscape of Burer-Monteiro Factorization: When do Global Solutions Correspond to Ground Truth?

J Ma, S Fattahi - arXiv preprint arXiv:2302.10963, 2023 - optimization-online.org
In low-rank matrix recovery, the goal is to recover a low-rank matrix, given a limited number
of linear and possibly noisy measurements. Low-rank matrix recovery is typically solved via …
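
For concreteness, the standard setup this snippet alludes to can be written out as follows; the symbols below are generic placeholders rather than the paper's notation.

```latex
% Observe m linear, possibly noisy measurements of a rank-r matrix M^*:
%   y_i = \langle A_i, M^* \rangle + w_i, \qquad i = 1, \dots, m.
% The Burer-Monteiro approach optimizes over a factor X, with r' \ge r
% allowing over-parameterization:
\min_{X \in \mathbb{R}^{n \times r'}} \;
\frac{1}{m} \sum_{i=1}^{m}
\ell\bigl(\langle A_i, X X^{\top} \rangle - y_i\bigr)
```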

Label Noise: Ignorance Is Bliss

Y Zhu, J Zhang, A Gangrade, C Scott - arXiv preprint arXiv:2411.00079, 2024 - arxiv.org
We establish a new theoretical framework for learning under multi-class, instance-
dependent label noise. This framework casts learning with label noise as a form of domain …