Benign overfitting without linearity: Neural network classifiers trained by gradient descent for noisy linear data

S Frei, NS Chatterji, P Bartlett - Conference on Learning …, 2022 - proceedings.mlr.press
Benign overfitting, the phenomenon where interpolating models generalize well in the
presence of noisy data, was first observed in neural network models trained with gradient …
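
For context, "benign overfitting" is typically formalized as a predictor that fits the noisy training data exactly while still approaching the best possible risk; a standard statement (our gloss, not the paper's) is

\[
  \hat{L}_n(\hat f) = \frac{1}{n}\sum_{i=1}^{n} \ell\big(\hat f(x_i), y_i\big) = 0
  \qquad \text{while} \qquad
  L(\hat f) = \mathbb{E}\,\ell\big(\hat f(x), y\big) \to L^{*},
\]

where \(L^{*}\) denotes the Bayes risk.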

Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization

S Frei, G Vardi, P Bartlett… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
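
The margin-maximization problem alluded to is, in its standard homogeneous form (a sketch of the usual setup, not the paper's exact statement),

\[
  \min_{\theta}\ \tfrac{1}{2}\|\theta\|_2^2
  \quad \text{s.t.} \quad y_i\, f(\theta; x_i) \ge 1, \quad i = 1, \dots, n,
\]

whose KKT conditions require stationarity \(\theta = \sum_i \lambda_i y_i \nabla_\theta f(\theta; x_i)\) (with Clarke subgradients for nonsmooth activations), multipliers \(\lambda_i \ge 0\), and complementary slackness \(\lambda_i\big(y_i f(\theta; x_i) - 1\big) = 0\).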

Optimistic rates: A unifying theory for interpolation learning and regularization in linear regression

L Zhou, F Koehler, DJ Sutherland… - ACM/IMS Journal of Data …, 2024 - dl.acm.org
We study a localized notion of uniform convergence known as an “optimistic rate” for linear
regression with Gaussian data. Our refined analysis avoids the hidden constant and …
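
An "optimistic rate" in this sense (following Srebro et al., 2010; our schematic paraphrase) is a localized uniform-convergence bound of the form

\[
  L(h) \;\le\; \Big( \sqrt{\hat L(h)} + \sqrt{\mathcal{R}_n} \Big)^{2}
  \quad \text{for all } h \in \mathcal{H},
\]

so that for interpolating predictors (\(\hat L(h) = 0\)) the risk is controlled by the complexity term \(\mathcal{R}_n\) alone, with no hidden multiplicative constant.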

A non-asymptotic Moreau envelope theory for high-dimensional generalized linear models

L Zhou, F Koehler, P Sur… - Advances in Neural …, 2022 - proceedings.neurips.cc
We prove a new generalization bound that shows for any class of linear predictors in
Gaussian space, the Rademacher complexity of the class and the training error under any …
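
The Moreau envelope in the title is the standard infimal-convolution smoothing of a function f (definition added for reference):

\[
  f_{\lambda}(x) \;=\; \inf_{y}\ \Big\{ f(y) + \tfrac{1}{2\lambda}\, \|x - y\|_2^2 \Big\},
\]

a smooth surrogate that lower-bounds f and shares its minimizers when f is convex.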

Benign overfitting in deep neural networks under lazy training

Z Zhu, F Liu, G Chrysos, F Locatello… - … on Machine Learning, 2023 - proceedings.mlr.press
This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU
activation functions and proves that when the data distribution is well-separated, DNNs can …
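
"Lazy training" refers to the regime where parameters stay near initialization, so the network behaves like its first-order (NTK-style) linearization; schematically (standard formulation, assumed here for context),

\[
  f(x; \theta) \;\approx\; f(x; \theta_0) + \big\langle \nabla_{\theta} f(x; \theta_0),\, \theta - \theta_0 \big\rangle .
\]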

Deep linear networks can benignly overfit when shallow ones do

NS Chatterji, PM Long - Journal of Machine Learning Research, 2023 - jmlr.org
We bound the excess risk of interpolating deep linear networks trained using gradient flow.
In a setting previously used to establish risk bounds for the minimum ℓ2-norm interpolant, we …
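
The minimum ℓ2-norm interpolant referenced here has the standard closed form (stated for reference, assuming \(X \in \mathbb{R}^{n \times d}\) has full row rank with \(n < d\)):

\[
  \hat w \;=\; \arg\min_{w}\ \|w\|_2 \ \ \text{s.t.}\ \ Xw = y
  \;=\; X^{\top}\big(XX^{\top}\big)^{-1} y .
\]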

Foolish crowds support benign overfitting

NS Chatterji, PM Long - Journal of Machine Learning Research, 2022 - jmlr.org
We prove a lower bound on the excess risk of sparse interpolating procedures for linear
regression with Gaussian data in the overparameterized regime. We apply this result to …
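
A canonical sparse interpolating procedure covered by such lower bounds is basis pursuit, the minimum ℓ1-norm interpolant (our example of the object in question):

\[
  \hat w_{\mathrm{BP}} \;=\; \arg\min_{w}\ \|w\|_1 \ \ \text{s.t.}\ \ Xw = y .
\]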

Precise asymptotic generalization for multiclass classification with overparameterized linear models

D Wu, A Sahai - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
We study the asymptotic generalization of an overparameterized linear model for multiclass
classification under the Gaussian covariates bi-level model introduced in Subramanian et …
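
As we recall it from this line of work (hedged; the paper's exact parameterization may differ), the bi-level model takes Gaussian covariates with a spiked diagonal covariance,

\[
  \Sigma = \operatorname{diag}(\lambda_1, \dots, \lambda_d), \qquad
  \lambda_j =
  \begin{cases}
    \dfrac{a d}{s}, & j \le s, \\[4pt]
    \dfrac{(1 - a)\, d}{d - s}, & j > s,
  \end{cases}
\]

so that a small number s of directions carries most of the feature energy, with d, s, a scaling polynomially in n.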

Generalization for multiclass classification with overparameterized linear models

V Subramanian, R Arya… - Advances in Neural …, 2022 - proceedings.neurips.cc
Via an overparameterized linear model with Gaussian features, we provide conditions for
good generalization for multiclass classification of minimum-norm interpolating solutions in …
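
Concretely, the minimum-norm interpolating solution for a k-class problem here is the least-norm fit to one-hot labels, classified by argmax (standard setup, our paraphrase):

\[
  \hat W \;=\; \arg\min_{W \in \mathbb{R}^{d \times k}}\ \|W\|_F \ \ \text{s.t.}\ \ X W = Y,
  \qquad
  \hat y(x) \;=\; \arg\max_{c \in [k]}\ x^{\top} \hat W e_c ,
\]

where \(Y \in \{0,1\}^{n \times k}\) stacks the one-hot label vectors.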

Noisy interpolation learning with shallow univariate ReLU networks

N Joshi, G Vardi, N Srebro - arXiv preprint arXiv:2307.15396, 2023 - arxiv.org
We study the asymptotic overfitting behavior of interpolation with minimum-norm (ℓ2 norm of
the weights) two-layer ReLU networks for noisy univariate regression. We show that …
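
For context, the minimal ℓ2 weight norm needed for a two-layer ReLU network to represent a univariate function f is known (Savarese et al., 2019; stated here as commonly cited) to equal a total-variation-type quantity,

\[
  \min_{\theta :\, f_{\theta} = f}\ \tfrac{1}{2}\|\theta\|_2^2
  \;=\; \max\!\Big( \int_{\mathbb{R}} |f''(x)|\, dx,\ \big|f'(-\infty) + f'(+\infty)\big| \Big),
\]

which is what makes the univariate min-norm interpolant amenable to a sharp overfitting analysis.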