The shape of learning curves: a review
Learning curves provide insight into the dependence of a learner's generalization
performance on the training set size. This important tool can be used for model selection, to …
A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
Cross-entropy loss functions: Theoretical analysis and applications
Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss
applied to the outputs of a neural network, when the softmax is used. But, what guarantees …
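The coincidence the abstract states can be checked directly: for two classes with logits (z, 0), the cross-entropy of the softmax output for the true class reduces to the logistic loss log(1 + exp(-z)). A minimal numerical sketch (function names are illustrative, not from the paper):

```python
import math

def softmax_cross_entropy(logits, target):
    # -log softmax(logits)[target], computed stably via the log-sum-exp trick
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def logistic_loss(margin):
    # log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Binary case with logits (z, 0): the two losses agree to machine precision.
z = 1.7
assert abs(softmax_cross_entropy([z, 0.0], target=0) - logistic_loss(z)) < 1e-12
```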
User-friendly introduction to PAC-Bayes bounds
P Alquier - Foundations and Trends® in Machine Learning, 2024 - nowpublishers.com
Aggregated predictors are obtained by making a set of basic predictors vote according to
some weights, that is, to some probability distribution. Randomized predictors are obtained …
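The two constructions the abstract contrasts can be sketched in a few lines: an aggregated predictor takes a weighted vote of the basic predictors, while a randomized predictor draws a single basic predictor from the same distribution. This is a generic illustration of those definitions, not code from the monograph:

```python
import random

def aggregated_predict(predictors, weights, x):
    # Weighted vote of {-1, +1} predictors under the distribution `weights`
    score = sum(w * h(x) for h, w in zip(predictors, weights))
    return 1 if score >= 0 else -1

def randomized_predict(predictors, weights, x, rng=random):
    # Draw one predictor according to the same distribution, then predict
    h = rng.choices(predictors, weights=weights, k=1)[0]
    return h(x)
```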
Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
Long-tail learning via logit adjustment
Real-world classification problems typically exhibit an imbalanced or long-tailed label
distribution, wherein many labels are associated with only a few samples. This poses a …
Theoretically principled trade-off between robustness and accuracy
We identify a trade-off between robustness and accuracy that serves as a guiding principle
in the design of defenses against adversarial examples. Although this problem has been …
Two-stage learning to defer with multiple experts
We study a two-stage scenario for learning to defer with multiple experts, which is crucial in
practice for many applications. In this scenario, a predictor is derived in a first stage by …
Does label smoothing mitigate label noise?
Label smoothing is commonly used in training deep learning models, wherein one-hot
training labels are mixed with uniform label vectors. Empirically, smoothing has been shown …
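The mixing the abstract describes is a convex combination of the one-hot label with the uniform distribution over the K classes. A minimal sketch, assuming a smoothing parameter eps (the function name is illustrative):

```python
def smooth_label(one_hot, eps):
    # Mix a one-hot label with the uniform distribution over the K classes:
    # (1 - eps) * y + eps / K, which still sums to 1.
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]

# With eps = 0.1 and 3 classes, the true class keeps most of the mass:
smoothed = smooth_label([0.0, 1.0, 0.0], eps=0.1)  # roughly [0.033, 0.933, 0.033]
```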
Image classification with deep learning in the presence of noisy labels: A survey
Image classification systems recently made a giant leap with the advancement of deep
neural networks. However, these systems require an excessive amount of labeled data to be …