The shape of learning curves: a review
Learning curves provide insight into the dependence of a learner's generalization
performance on the training set size. This important tool can be used for model selection, to …
A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
Cross-entropy loss functions: Theoretical analysis and applications
Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss
applied to the outputs of a neural network, when the softmax is used. But, what guarantees …
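The coincidence the abstract states can be checked directly: for two classes with logits (z, 0), the cross-entropy of the softmax output for the true class reduces to the logistic loss log(1 + exp(-z)). A minimal numerical sketch (function names are illustrative, not from the paper):

```python
import math

def softmax_cross_entropy(logits, target):
    # -log softmax(logits)[target], computed stably via the log-sum-exp trick
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def logistic_loss(margin):
    # log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Binary case with logits (z, 0): the two losses agree to machine precision.
z = 1.7
assert abs(softmax_cross_entropy([z, 0.0], target=0) - logistic_loss(z)) < 1e-12
```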
User-friendly introduction to PAC-Bayes bounds
P Alquier - Foundations and Trends® in Machine Learning, 2024 - nowpublishers.com
Aggregated predictors are obtained by making a set of basic predictors vote according to
some weights, that is, to some probability distribution. Randomized predictors are obtained …
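The two constructions the abstract contrasts can be sketched in a few lines: an aggregated predictor takes a weighted vote of the basic predictors, while a randomized predictor draws a single basic predictor from the same distribution. This is a generic illustration of those definitions, not code from the monograph:

```python
import random

def aggregated_predict(predictors, weights, x):
    # Weighted vote of {-1, +1} predictors under the distribution `weights`
    score = sum(w * h(x) for h, w in zip(predictors, weights))
    return 1 if score >= 0 else -1

def randomized_predict(predictors, weights, x, rng=random):
    # Draw one predictor according to the same distribution, then predict
    h = rng.choices(predictors, weights=weights, k=1)[0]
    return h(x)
```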
Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
Long-tail learning via logit adjustment
Real-world classification problems typically exhibit an imbalanced or long-tailed label
distribution, wherein many labels are associated with only a few samples. This poses a …
Theoretically principled trade-off between robustness and accuracy
We identify a trade-off between robustness and accuracy that serves as a guiding principle
in the design of defenses against adversarial examples. Although this problem has been …
Two-stage learning to defer with multiple experts
We study a two-stage scenario for learning to defer with multiple experts, which is crucial in
practice for many applications. In this scenario, a predictor is derived in a first stage by …
Does label smoothing mitigate label noise?
Label smoothing is commonly used in training deep learning models, wherein one-hot
training labels are mixed with uniform label vectors. Empirically, smoothing has been shown …
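The mixing the abstract describes is a convex combination of the one-hot label with the uniform distribution over the K classes. A minimal sketch, assuming a smoothing parameter eps (the function name is illustrative):

```python
def smooth_label(one_hot, eps):
    # Mix a one-hot label with the uniform distribution over the K classes:
    # (1 - eps) * y + eps / K, which still sums to 1.
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]

# With eps = 0.1 and 3 classes, the true class keeps most of the mass:
smoothed = smooth_label([0.0, 1.0, 0.0], eps=0.1)  # roughly [0.033, 0.933, 0.033]
```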
Image classification with deep learning in the presence of noisy labels: A survey
Image classification systems recently made a giant leap with the advancement of deep
neural networks. However, these systems require an excessive amount of labeled data to be …