Benign overfitting in two-layer convolutional neural networks

Y Cao, Z Chen, M Belkin, Q Gu - Advances in neural …, 2022 - proceedings.neurips.cc
Modern neural networks often have great expressive power and can be trained to overfit the
training data, while still achieving a good test performance. This phenomenon is referred to …

Benign, tempered, or catastrophic: Toward a refined taxonomy of overfitting

N Mallinar, J Simon, A Abedsoltan… - Advances in …, 2022 - proceedings.neurips.cc
The practical success of overparameterized neural networks has motivated the recent
scientific study of "interpolating methods": learning methods which are able to fit their …

Benign overfitting in two-layer ReLU convolutional neural networks

Y Kou, Z Chen, Y Chen, Q Gu - International Conference on …, 2023 - proceedings.mlr.press
Modern deep learning models with great expressive power can be trained to overfit the
training data but still generalize well. This phenomenon is referred to as benign overfitting …

Implicit bias of gradient descent for two-layer ReLU and leaky ReLU networks on nearly-orthogonal data

Y Kou, Z Chen, Q Gu - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The implicit bias towards solutions with favorable properties is believed to be a key reason
why neural networks trained by gradient-based optimization can generalize well. While the …

Why does sharpness-aware minimization generalize better than SGD?

Z Chen, J Zhang, Y Kou, X Chen… - Advances in neural …, 2024 - proceedings.neurips.cc
The challenge of overfitting, in which the model memorizes the training data and fails to
generalize to test data, has become increasingly significant in the training of large neural …

Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization

S Frei, G Vardi, P Bartlett… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
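The KKT conditions mentioned in this snippet are the standard ones for margin maximization. As a sketch (notation assumed here, not taken from the paper): with network output f(w; x) and labels y_i in {±1}, the margin-maximization problem and its KKT conditions read

\begin{align*}
&\min_{w}\ \tfrac{1}{2}\|w\|_2^2 \quad \text{s.t.}\quad y_i\, f(w; x_i) \ge 1 \ \ \forall i,\\
&\text{stationarity:}\quad w = \sum_i \lambda_i\, y_i\, \nabla_w f(w; x_i),\\
&\text{complementarity:}\quad \lambda_i \ge 0,\quad \lambda_i\bigl(y_i\, f(w; x_i) - 1\bigr) = 0\ \ \forall i.
\end{align*}

Only training points on the margin (y_i f(w; x_i) = 1) can have \lambda_i > 0, so the limiting predictor is a combination of gradients at those support points.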

The double-edged sword of implicit bias: Generalization vs. robustness in ReLU networks

S Frei, G Vardi, P Bartlett… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we study the implications of the implicit bias of gradient flow on generalization
and adversarial robustness in ReLU networks. We focus on a setting where the data …

Implicit bias in leaky ReLU networks trained on high-dimensional data

S Frei, G Vardi, PL Bartlett, N Srebro, W Hu - arXiv preprint arXiv …, 2022 - arxiv.org
The implicit biases of gradient-based optimization algorithms are conjectured to be a major
factor in the success of modern deep learning. In this work, we investigate the implicit bias of …

The benefits of mixup for feature learning

D Zou, Y Cao, Y Li, Q Gu - International Conference on …, 2023 - proceedings.mlr.press
Mixup, a simple data augmentation method that randomly mixes two data points via linear
interpolation, has been extensively applied in various deep learning applications to gain …
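The linear interpolation the snippet describes can be sketched in a few lines. This is a minimal NumPy version of the standard mixup recipe (the Beta(alpha, alpha) mixing coefficient is the usual choice; `alpha=0.2` is an assumed default, not a value from the paper):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two examples by a Beta(alpha, alpha)-distributed coefficient.

    Both the inputs and the (one-hot or scalar) labels are combined with the
    same coefficient lam, producing a convex combination of the two examples.
    """
    rng = rng if rng is not None else np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

# Usage: mix an all-zeros example (label 0) with an all-ones example (label 1).
x, y = mixup(np.zeros(3), 0.0, np.ones(3), 1.0, rng=np.random.default_rng(0))
```
In this toy call each entry of `x` equals `y` (both are `1 - lam`), which makes the shared mixing coefficient easy to inspect.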

The implicit bias of benign overfitting

O Shamir - Conference on Learning Theory, 2022 - proceedings.mlr.press
The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data
while attaining low expected loss, has received much attention in recent years, but still …