Robust training under label noise by over-parameterization
Recently, over-parameterized deep networks, with ever more network parameters
than training samples, have dominated the performance of modern machine learning …
Do we really need a new theory to understand over-parameterization?
This century saw an unprecedented increase in public and private investment in Artificial
Intelligence (AI) and especially in (Deep) Machine Learning (ML). This led to breakthroughs …
Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with the ℓ1-loss, where the …
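The snippet breaks off mid-sentence, so the paper's exact measurement model is not shown here. Purely to illustrate the three ingredients the title names (an ℓ1-loss, small initialization, and an over-parameterized factorization), here is a minimal NumPy sketch of a sub-gradient method on a factorized ℓ1 objective; the Gaussian measurements, outlier fraction, and geometrically decaying step size are illustrative assumptions, not necessarily the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r_true, k, m = 20, 2, 5, 600   # k > r_true: over-parameterized factor U

# Ground-truth low-rank PSD matrix and random Gaussian measurements.
U_star = rng.normal(size=(n, r_true))
M_star = U_star @ U_star.T
A = rng.normal(size=(m, n, n))
y = np.einsum("mij,ij->m", A, M_star)

# Corrupt a fraction of the measurements with gross outliers.
bad = rng.random(m) < 0.1
y[bad] += rng.normal(scale=20.0, size=bad.sum())

# SubGM on f(U) = (1/m) * sum_i |<A_i, U U^T> - y_i|: small init, decaying steps.
U = 1e-3 * rng.normal(size=(n, k))
eta = 0.1
for _ in range(3000):
    residual = np.einsum("mij,ij->m", A, U @ U.T) - y
    S = np.einsum("m,mij->ij", np.sign(residual), A) / m  # subgradient w.r.t. U U^T
    U -= eta * (S + S.T) @ U                               # chain rule through U U^T
    eta *= 0.999

print(np.linalg.norm(U @ U.T - M_star) / np.linalg.norm(M_star))  # small => recovered
```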
Smoothing the edges: A general framework for smooth optimization in sparse regularization using Hadamard overparametrization
This paper presents a framework for smooth optimization of objectives with ℓq and ℓp,q
regularization for (structured) sparsity. Finding solutions to these non-smooth and possibly …
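For concreteness, the trick in its simplest (ℓ1 / lasso) instance: writing β = u ⊙ v and penalizing (‖u‖² + ‖v‖²)/2 yields a smooth objective whose global minimum matches the lasso, because |b| = min over uv = b of (u² + v²)/2. The sketch below is a hedged illustration of that special case only; the data, step size, and iteration count are assumptions, and the paper's framework targets the more general ℓq and ℓp,q penalties.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 200, 20, 0.1
X = rng.normal(size=(n, d))
beta_true = np.zeros(d)
beta_true[:3] = [2.0, -1.5, 1.0]              # sparse ground truth
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Smooth surrogate of the lasso via Hadamard overparametrization beta = u * v:
#   min_{u,v}  ||y - X (u*v)||^2 / (2n) + (lam / 2) * (||u||^2 + ||v||^2)
u = rng.normal(scale=0.5, size=d)
v = rng.normal(scale=0.5, size=d)
eta = 0.05
for _ in range(10000):
    g = X.T @ (X @ (u * v) - y) / n   # gradient of the data term w.r.t. beta
    gu = g * v + lam * u              # chain rule: d(beta)/d(u) = diag(v)
    gv = g * u + lam * v
    u -= eta * gu
    v -= eta * gv

print(np.round(u * v, 3))             # near-sparse, close to the lasso solution
```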
Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification
We consider using gradient descent to minimize the nonconvex function f(X) = ϕ(XXᵀ) over
an n × r factor matrix X, in which ϕ is an underlying smooth convex cost function defined over …
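In update form, a right preconditioner (XᵀX + λI)⁻¹ is applied to the gradient, which counteracts the ill-conditioning that over-parameterized factors X develop near low-rank solutions. Below is a toy sketch with the convex choice ϕ(M) = ½‖M − M*‖²_F; tying the damping λ to the residual norm is a simplifying assumption here, and none of the paper's optimality-certification machinery is reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r_true, r = 30, 3, 6                  # r > r_true: over-parameterized factor
U_star = rng.normal(size=(n, r_true))
M_star = U_star @ U_star.T

# f(X) = phi(X X^T) with phi(M) = 0.5 * ||M - M_star||_F^2,
# so grad f(X) = 2 (X X^T - M_star) X.
X = 0.1 * rng.normal(size=(n, r))
eta = 0.2
for _ in range(1000):
    R = X @ X.T - M_star
    G = 2.0 * R @ X
    lam = np.linalg.norm(R, "fro")                  # damping tied to residual (assumption)
    P = np.linalg.inv(X.T @ X + lam * np.eye(r))    # right preconditioner
    X -= eta * G @ P

print(np.linalg.norm(X @ X.T - M_star) / np.linalg.norm(M_star))
```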
Behind the scenes of gradient descent: A trajectory analysis via basis function decomposition
This work analyzes the solution trajectory of gradient-based algorithms via a novel basis
function decomposition. We show that, although solution trajectories of gradient-based …
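The snippet is cut off before the decomposition is defined, so what follows is only the textbook special case that motivates such analyses, not the paper's construction: for linear least squares, the gradient-descent trajectory decomposes exactly along the eigenvectors of the Hessian, and the i-th error coefficient shrinks geometrically at rate (1 − η·λᵢ).

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star

# Hessian of 0.5/n * ||X w - y||^2 and its eigenbasis.
H = X.T @ X / n
lams, V = np.linalg.eigh(H)

# Along eigenvector v_i, the coefficient c_i(t) = <v_i, w_t - w_star>
# evolves in closed form: c_i(t) = (1 - eta * lam_i)^t * c_i(0).
eta, T = 0.1, 50
w = np.zeros(d)
observed, predicted = [], []
for t in range(T):
    observed.append(V.T @ (w - w_star))
    predicted.append((1.0 - eta * lams) ** t * (V.T @ (-w_star)))
    w -= eta * X.T @ (X @ w - y) / n

print(np.max(np.abs(np.array(observed) - np.array(predicted))))  # ~ machine precision
```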
Mitigating label noise through data ambiguation
J Lienen, E Hüllermeier - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Label noise poses an important challenge in machine learning, especially in deep learning,
in which large models with high expressive power dominate the field. Models of that kind are …
Efficient compression of overparameterized deep models through low-dimensional learning dynamics
Overparameterized models have proven to be powerful tools for solving various machine
learning tasks. However, overparameterization often leads to a substantial increase in …
On the Optimization Landscape of Burer–Monteiro Factorization: When do Global Solutions Correspond to Ground Truth?
In low-rank matrix recovery, the goal is to recover a low-rank matrix, given a limited number
of linear and possibly noisy measurements. Low-rank matrix recovery is typically solved via …
Label Noise: Ignorance Is Bliss
We establish a new theoretical framework for learning under multi-class, instance-
dependent label noise. This framework casts learning with label noise as a form of domain …