Benign overfitting in two-layer convolutional neural networks
Modern neural networks often have great expressive power and can be trained to overfit the
training data, while still achieving a good test performance. This phenomenon is referred to …
Benign, tempered, or catastrophic: Toward a refined taxonomy of overfitting
The practical success of overparameterized neural networks has motivated the recent
scientific study of \emph{interpolating methods} -- learning methods which are able to fit their …
Benign overfitting in two-layer ReLU convolutional neural networks
Modern deep learning models with great expressive power can be trained to overfit the
training data but still generalize well. This phenomenon is referred to as benign overfitting …
Implicit bias of gradient descent for two-layer ReLU and leaky ReLU networks on nearly-orthogonal data
The implicit bias towards solutions with favorable properties is believed to be a key reason
why neural networks trained by gradient-based optimization can generalize well. While the …
Why does sharpness-aware minimization generalize better than SGD?
The challenge of overfitting, in which the model memorizes the training data and fails to
generalize to test data, has become increasingly significant in the training of large neural …
Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
The double-edged sword of implicit bias: Generalization vs. robustness in ReLU networks
In this work, we study the implications of the implicit bias of gradient flow on generalization
and adversarial robustness in ReLU networks. We focus on a setting where the data …
Implicit bias in leaky ReLU networks trained on high-dimensional data
The implicit biases of gradient-based optimization algorithms are conjectured to be a major
factor in the success of modern deep learning. In this work, we investigate the implicit bias of …
The benefits of mixup for feature learning
Mixup, a simple data augmentation method that randomly mixes two data points via linear
interpolation, has been extensively applied in various deep learning applications to gain …
The implicit bias of benign overfitting
O Shamir - Conference on Learning Theory, 2022 - proceedings.mlr.press
The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data
while attaining low expected loss, has received much attention in recent years, but still …