The shape of learning curves: a review
Learning curves provide insight into the dependence of a learner's generalization
performance on the training set size. This important tool can be used for model selection, to …
A farewell to the bias-variance tradeoff? an overview of the theory of overparameterized machine learning
Y Dar, V Muthukumar, RG Baraniuk - arXiv preprint arXiv:2109.02355, 2021 - arxiv.org
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
Bayesian deep learning and a probabilistic perspective of generalization
AG Wilson, P Izmailov - Advances in neural information …, 2020 - proceedings.neurips.cc
The key distinguishing property of a Bayesian approach is marginalization, rather than using
a single setting of weights. Bayesian marginalization can particularly improve the accuracy …
Towards understanding grokking: An effective theory of representation learning
We aim to understand grokking, a phenomenon where models generalize long after
overfitting their training set. We present both a microscopic analysis anchored by an effective …
Direct parameterization of Lipschitz-bounded deep networks
R Wang, I Manchester - International Conference on …, 2023 - proceedings.mlr.press
This paper introduces a new parameterization of deep neural networks (both fully-connected
and convolutional) with guaranteed $\ell^2$ Lipschitz bounds, i.e., limited sensitivity to input …
Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks
A theoretical understanding of generalization remains an open problem for many machine
learning models, including deep networks where overparameterization leads to better …
Learning Curves for Decision Making in Supervised Machine Learning--A Survey
F Mohr, JN van Rijn - arXiv preprint arXiv:2201.12150, 2022 - arxiv.org
Learning curves are a concept from social sciences that has been adopted in the context of
machine learning to assess the performance of a learning algorithm with respect to a certain …
Data feedback loops: Model-driven amplification of dataset biases
R Taori, T Hashimoto - International Conference on Machine …, 2023 - proceedings.mlr.press
Datasets scraped from the internet have been critical to large-scale machine learning. Yet,
this success puts the utility of future internet-derived datasets at potential risk, as model …
Benign overfitting of constant-stepsize sgd for linear regression
There is an increasing realization that algorithmic inductive biases are central in preventing
overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized …
Shape matters: Understanding the implicit bias of the noise covariance
The noise in stochastic gradient descent (SGD) provides a crucial implicit regularization
effect for training overparameterized models. Prior theoretical work largely focuses on …