A statistical mechanics framework for Bayesian deep neural networks beyond the infinite-width limit
Despite the practical success of deep neural networks, a comprehensive theoretical
framework that can predict practically relevant scores, such as the test accuracy, from …
On the stepwise nature of self-supervised learning
We present a simple picture of the training process of self-supervised learning methods with
dual deep networks. In our picture, these methods learn their high-dimensional embeddings …
Feature-learning networks are consistent across widths at realistic scales
We study the effect of width on the dynamics of feature-learning neural networks across a
variety of architectures and datasets. Early in training, wide neural networks trained on …
Mechanism for feature learning in neural networks and backpropagation-free machine learning models
Understanding how neural networks learn features, or relevant patterns in data, for
prediction is necessary for their reliable use in technological and scientific applications. In …
A spectral condition for feature learning
The push to train ever larger neural networks has motivated the study of initialization and
training at large network width. A key challenge is to scale training so that a network's …
Mechanism of feature learning in convolutional neural networks
Understanding the mechanism of how convolutional neural networks learn features from
image data is a fundamental problem in machine learning and computer vision. In this work …
A dynamical model of neural scaling laws
On a variety of tasks, the performance of neural networks predictably improves with training
time, dataset size and model size across many orders of magnitude. This phenomenon is …
Generalization Ability of Wide Neural Networks on $\mathbb{R}$
We perform a study on the generalization ability of the wide two-layer ReLU neural network
on $\mathbb{R}$. We first establish some spectral properties of the neural tangent kernel …
A mathematical theory of relational generalization in transitive inference
Humans and animals routinely infer relations between different items or events and
generalize these relations to novel combinations of items. This allows them to respond …
Statistical mechanics of deep learning beyond the infinite-width limit
Decades-long literature testifies to the success of statistical mechanics at clarifying
fundamental aspects of deep learning. Yet the ultimate goal remains elusive: we lack a …