A survey of end-to-end driving: Architectures and training methods

A Tampuu, T Matiisen, M Semikin… - … on Neural Networks …, 2020 - ieeexplore.ieee.org
Autonomous driving is of great interest to industry and academia alike. The use of machine
learning approaches for autonomous driving has long been studied, but mostly in the …

Deep double descent: Where bigger models and more data hurt

P Nakkiran, G Kaplun, Y Bansal, T Yang… - Journal of Statistical …, 2021 - iopscience.iop.org
We show that a variety of modern deep learning tasks exhibit a 'double-
descent' phenomenon where, as we increase model size, performance first gets worse and …

Reconciling modern machine-learning practice and the classical bias–variance trade-off

M Belkin, D Hsu, S Ma… - Proceedings of the …, 2019 - National Acad Sciences
Breakthroughs in machine learning are rapidly changing science and society, yet our
fundamental understanding of this technology has lagged far behind. Indeed, one of the …

From stars to subgraphs: Uplifting any GNN with local structure awareness

L Zhao, W Jin, L Akoglu, N Shah - arXiv preprint arXiv:2110.03753, 2021 - arxiv.org
Message Passing Neural Networks (MPNNs) are a common type of Graph Neural Network
(GNN), in which each node's representation is computed recursively by aggregating …

Exploring the limitations of behavior cloning for autonomous driving

F Codevilla, E Santana, AM López… - Proceedings of the …, 2019 - openaccess.thecvf.com
Driving requires reacting to a wide variety of complex environment conditions and agent
behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation …

Landscape and training regimes in deep learning

M Geiger, L Petrini, M Wyart - Physics Reports, 2021 - Elsevier
Deep learning algorithms are responsible for a technological revolution in a variety of tasks
including image recognition or Go playing. Yet, why they work is not understood. Ultimately …

Two models of double descent for weak features

M Belkin, D Hsu, J Xu - SIAM Journal on Mathematics of Data Science, 2020 - SIAM
The “double descent” risk curve was proposed to qualitatively describe the out-of-sample
prediction accuracy of variably parameterized machine learning models. This article …

Assessing generalization of SGD via disagreement

Y Jiang, V Nagarajan, C Baek, JZ Kolter - arXiv preprint arXiv:2106.13799, 2021 - arxiv.org
We empirically show that the test error of deep networks can be estimated by simply training
the same architecture on the same training set but with a different run of Stochastic Gradient …

Rethinking soft labels for knowledge distillation: A bias-variance tradeoff perspective

H Zhou, L Song, J Chen, Y Zhou, G Wang… - arXiv preprint arXiv …, 2021 - arxiv.org
Knowledge distillation is an effective approach to leverage a well-trained network or an
ensemble of them, called the teacher, to guide the training of a student network. The …

Generalisation error in learning with random features and the hidden manifold model

F Gerace, B Loureiro, F Krzakala… - International …, 2020 - proceedings.mlr.press
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …