A statistical mechanics framework for Bayesian deep neural networks beyond the infinite-width limit

R Pacelli, S Ariosto, M Pastore, F Ginelli… - Nature Machine …, 2023 - nature.com
Despite the practical success of deep neural networks, a comprehensive theoretical
framework that can predict practically relevant scores, such as the test accuracy, from …

On the stepwise nature of self-supervised learning

JB Simon, M Knutins, L Ziyin, D Geisz… - International …, 2023 - proceedings.mlr.press
We present a simple picture of the training process of self-supervised learning methods with
dual deep networks. In our picture, these methods learn their high-dimensional embeddings …

Feature-learning networks are consistent across widths at realistic scales

N Vyas, A Atanasov, B Bordelon… - Advances in …, 2024 - proceedings.neurips.cc
We study the effect of width on the dynamics of feature-learning neural networks across a
variety of architectures and datasets. Early in training, wide neural networks trained on …

Mechanism for feature learning in neural networks and backpropagation-free machine learning models

A Radhakrishnan, D Beaglehole, P Pandit, M Belkin - Science, 2024 - science.org
Understanding how neural networks learn features, or relevant patterns in data, for
prediction is necessary for their reliable use in technological and scientific applications. In …

A spectral condition for feature learning

G Yang, JB Simon, J Bernstein - arXiv preprint arXiv:2310.17813, 2023 - arxiv.org
The push to train ever larger neural networks has motivated the study of initialization and
training at large network width. A key challenge is to scale training so that a network's …

Mechanism of feature learning in convolutional neural networks

D Beaglehole, A Radhakrishnan, P Pandit… - arXiv preprint arXiv …, 2023 - arxiv.org
Understanding the mechanism of how convolutional neural networks learn features from
image data is a fundamental problem in machine learning and computer vision. In this work …

A dynamical model of neural scaling laws

B Bordelon, A Atanasov, C Pehlevan - arXiv preprint arXiv:2402.01092, 2024 - arxiv.org
On a variety of tasks, the performance of neural networks predictably improves with training
time, dataset size and model size across many orders of magnitude. This phenomenon is …

Generalization Ability of Wide Neural Networks on $\mathbb{R}$

J Lai, M Xu, R Chen, Q Lin - arXiv preprint arXiv:2302.05933, 2023 - arxiv.org
We perform a study on the generalization ability of the wide two-layer ReLU neural network
on $\mathbb{R}$. We first establish some spectral properties of the neural tangent kernel …

A mathematical theory of relational generalization in transitive inference

S Lippl, K Kay, G Jensen, VP Ferrera… - Proceedings of the …, 2024 - pnas.org
Humans and animals routinely infer relations between different items or events and
generalize these relations to novel combinations of items. This allows them to respond …

Statistical mechanics of deep learning beyond the infinite-width limit

S Ariosto, R Pacelli, M Pastore, F Ginelli… - arXiv preprint arXiv …, 2022 - arxiv.org
Decades-long literature testifies to the success of statistical mechanics at clarifying
fundamental aspects of deep learning. Yet the ultimate goal remains elusive: we lack a …