A survey of end-to-end driving: Architectures and training methods

A Tampuu, T Matiisen, M Semikin… - … on Neural Networks …, 2020 - ieeexplore.ieee.org
Autonomous driving is of great interest to industry and academia alike. The use of machine
learning approaches for autonomous driving has long been studied, but mostly in the …

Deep double descent: Where bigger models and more data hurt

P Nakkiran, G Kaplun, Y Bansal, T Yang… - Journal of Statistical …, 2021 - iopscience.iop.org
We show that a variety of modern deep learning tasks exhibit a 'double-
descent' phenomenon where, as we increase model size, performance first gets worse and …

Reconciling modern machine-learning practice and the classical bias–variance trade-off

M Belkin, D Hsu, S Ma… - Proceedings of the …, 2019 - National Acad Sciences
Breakthroughs in machine learning are rapidly changing science and society, yet our
fundamental understanding of this technology has lagged far behind. Indeed, one of the …

From stars to subgraphs: Uplifting any GNN with local structure awareness

L Zhao, W Jin, L Akoglu, N Shah - arXiv preprint arXiv:2110.03753, 2021 - arxiv.org
Message Passing Neural Networks (MPNNs) are a common type of Graph Neural Network
(GNN), in which each node's representation is computed recursively by aggregating …

Exploring the limitations of behavior cloning for autonomous driving

F Codevilla, E Santana, AM López… - Proceedings of the …, 2019 - openaccess.thecvf.com
Driving requires reacting to a wide variety of complex environment conditions and agent
behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation …

Landscape and training regimes in deep learning

M Geiger, L Petrini, M Wyart - Physics Reports, 2021 - Elsevier
Deep learning algorithms are responsible for a technological revolution in a variety of tasks
including image recognition or Go playing. Yet, why they work is not understood. Ultimately …

Two models of double descent for weak features

M Belkin, D Hsu, J Xu - SIAM Journal on Mathematics of Data Science, 2020 - SIAM
The “double descent” risk curve was proposed to qualitatively describe the out-of-sample
prediction accuracy of variably parameterized machine learning models. This article …

Assessing generalization of SGD via disagreement

Y Jiang, V Nagarajan, C Baek, JZ Kolter - arXiv preprint arXiv:2106.13799, 2021 - arxiv.org
We empirically show that the test error of deep networks can be estimated by simply training
the same architecture on the same training set but with a different run of Stochastic Gradient …

Rethinking soft labels for knowledge distillation: A bias-variance tradeoff perspective

H Zhou, L Song, J Chen, Y Zhou, G Wang… - arXiv preprint arXiv …, 2021 - arxiv.org
Knowledge distillation is an effective approach to leverage a well-trained network or an
ensemble of them, called the teacher, to guide the training of a student network. The …

Generalisation error in learning with random features and the hidden manifold model

F Gerace, B Loureiro, F Krzakala… - International …, 2020 - proceedings.mlr.press
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …