Benign overfitting without linearity: Neural network classifiers trained by gradient descent for noisy linear data
Benign overfitting, the phenomenon where interpolating models generalize well in the
presence of noisy data, was first observed in neural network models trained with gradient …
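For context, “benign overfitting” is typically formalized as a predictor $\hat f$ that interpolates the noisy training set while its population risk stays near optimal (a standard definition, not this paper's exact statement):
\[
\hat f(x_i) = y_i \ \text{ for all } i \in [n], \qquad L(\hat f) := \mathbb{E}\,\ell\bigl(\hat f(x), y\bigr) \le L^{*} + o(1),
\]
where $L^{*}$ denotes the Bayes-optimal risk.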
Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
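As a sketch, for a network $f(\theta;\cdot)$ and training set $\{(x_i, y_i)\}_{i=1}^{n}$, the margin-maximization problem behind these KKT conditions is the standard one from the implicit-bias literature:
\[
\min_{\theta} \tfrac{1}{2}\|\theta\|_2^{2} \quad \text{s.t.} \quad y_i\, f(\theta; x_i) \ge 1 \ \ \forall i,
\]
whose KKT conditions require multipliers $\lambda_i \ge 0$ with
\[
\theta = \sum_{i=1}^{n} \lambda_i\, y_i\, \nabla_{\theta} f(\theta; x_i), \qquad \lambda_i \bigl( y_i f(\theta; x_i) - 1 \bigr) = 0,
\]
where for the nonsmooth leaky ReLU the gradient is read as a Clarke subgradient.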
Optimistic rates: A unifying theory for interpolation learning and regularization in linear regression
We study a localized notion of uniform convergence known as an “optimistic rate” for linear
regression with Gaussian data. Our refined analysis avoids the hidden constant and …
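Roughly, an optimistic rate is a generalization bound that tightens as the training error shrinks; a generic shape (a sketch, not the paper's exact bound) is
\[
L(\hat w) \;\le\; \Bigl( \sqrt{\hat L(\hat w)} + C \sqrt{\tfrac{\mathrm{comp}(\hat w)}{n}} \Bigr)^{2},
\]
so at interpolation ($\hat L(\hat w) = 0$) the bound avoids the multiplicative constants of classical uniform convergence.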
A non-asymptotic Moreau envelope theory for high-dimensional generalized linear models
We prove a new generalization bound that shows for any class of linear predictors in
Gaussian space, the Rademacher complexity of the class and the training error under any …
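The Moreau envelope in the title is the standard smoothing: for a loss $f$ and parameter $\lambda > 0$,
\[
f_{\lambda}(x) \;=\; \inf_{y} \Bigl\{ f(y) + \tfrac{1}{2\lambda} \|x - y\|_2^{2} \Bigr\},
\]
a smooth lower approximation of $f$ in terms of which such bounds can be stated.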
Benign overfitting in deep neural networks under lazy training
This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU
activation functions and proves that when the data distribution is well-separated, DNNs can …
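“Lazy training” is the regime in which parameters barely move from their initialization $\theta_0$, so the network is well approximated by its first-order expansion (a standard sketch, not specific to this paper):
\[
f(\theta; x) \;\approx\; f(\theta_0; x) + \bigl\langle \nabla_{\theta} f(\theta_0; x),\; \theta - \theta_0 \bigr\rangle,
\]
which makes the training dynamics effectively linear in $\theta$ (the neural-tangent-kernel picture).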
Deep linear networks can benignly overfit when shallow ones do
NS Chatterji, PM Long - Journal of Machine Learning Research, 2023 - jmlr.org
We bound the excess risk of interpolating deep linear networks trained using gradient flow.
In a setting previously used to establish risk bounds for the minimum ℓ2-norm interpolant, we …
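The minimum $\ell_2$-norm interpolant referenced here is, for a data matrix $X \in \mathbb{R}^{n \times d}$ with $d > n$ and labels $y \in \mathbb{R}^{n}$,
\[
\hat w \;=\; \operatorname*{arg\,min}_{w \,:\, Xw = y} \|w\|_2 \;=\; X^{\top} (X X^{\top})^{-1} y,
\]
with the closed form valid whenever $X X^{\top}$ is invertible.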
Foolish crowds support benign overfitting
NS Chatterji, PM Long - Journal of Machine Learning Research, 2022 - jmlr.org
We prove a lower bound on the excess risk of sparse interpolating procedures for linear
regression with Gaussian data in the overparameterized regime. We apply this result to …
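For reference, the excess risk being lower-bounded is the usual gap to the noise floor (a standard definition for square loss with noise level $\sigma^2$; the paper's setup may differ in details):
\[
\mathcal{E}(\hat w) \;=\; \mathbb{E}\bigl[(y - x^{\top} \hat w)^{2}\bigr] - \sigma^{2}.
\]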
Precise asymptotic generalization for multiclass classification with overparameterized linear models
We study the asymptotic generalization of an overparameterized linear model for multiclass
classification under the Gaussian covariates bi-level model introduced in Subramanian et …
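Schematically, a bi-level covariate model places a few high-variance directions above many low-variance ones (a sketch; the cited construction fixes specific scalings for these parameters):
\[
\Sigma \;=\; \begin{pmatrix} \lambda_H I_{s} & 0 \\ 0 & \lambda_L I_{d-s} \end{pmatrix}, \qquad \lambda_H \gg \lambda_L, \quad s \ll d.
\]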
Generalization for multiclass classification with overparameterized linear models
V Subramanian, R Arya… - Advances in Neural …, 2022 - proceedings.neurips.cc
Via an overparameterized linear model with Gaussian features, we provide conditions for
good generalization for multiclass classification of minimum-norm interpolating solutions in …
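With one-hot labels $Y \in \{0,1\}^{n \times k}$, the minimum-norm interpolating solution for $k$-class problems is the natural matrix analogue of the binary case (a sketch of the standard formulation):
\[
\hat W \;=\; \operatorname*{arg\,min}_{W \in \mathbb{R}^{d \times k} \,:\, XW = Y} \|W\|_F,
\]
with a test point $x$ classified as $\arg\max_{c}\,(x^{\top} \hat W)_{c}$.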
Noisy interpolation learning with shallow univariate ReLU networks
We study the asymptotic overfitting behavior of interpolation with minimum norm ($\ell_2$ of
the weights) two-layer ReLU networks for noisy univariate regression. We show that …
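Concretely, the objects studied are two-layer networks $f(x) = \sum_{j=1}^{m} a_j\,[\,w_j x + b_j\,]_{+} + c$ on univariate inputs, interpolated with minimum weight norm (a sketch of the standard formulation, with biases typically left unregularized):
\[
\min_{a, w, b, c} \;\sum_{j=1}^{m} \bigl( a_j^{2} + w_j^{2} \bigr) \quad \text{s.t.} \quad f(x_i) = y_i \ \ \forall i.
\]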