Neural collapse: A review on modelling principles and generalization

V Kothapalli - arXiv preprint arXiv:2206.04041, 2022 - arxiv.org
Deep classifier neural networks enter the terminal phase of training (TPT) when training
error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural …

Directional convergence and alignment in deep learning

Z Ji, M Telgarsky - Advances in Neural Information …, 2020 - proceedings.neurips.cc
In this paper, we show that although the minimizers of cross-entropy and related
classification losses are off at infinity, network weights learned by gradient flow converge in …

Fantastic generalization measures and where to find them

Y Jiang, B Neyshabur, H Mobahi, D Krishnan… - arXiv preprint arXiv …, 2019 - arxiv.org
Generalization of deep networks has been of great interest in recent years, resulting in a
number of theoretically and empirically motivated complexity measures. However, most …

On the measure of intelligence

F Chollet - arXiv preprint arXiv:1911.01547, 2019 - arxiv.org
To make deliberate progress towards more intelligent and more human-like artificial
systems, we need to be following an appropriate feedback signal: we need to be able to …

The modern mathematics of deep learning

J Berner, P Grohs, G Kutyniok… - arXiv preprint arXiv …, 2021 - cambridge.org
We describe the new field of the mathematical analysis of deep learning. This field emerged
around a list of research questions that were not answered within the classical framework of …

Predicting with confidence on unseen distributions

D Guillory, V Shankar, S Ebrahimi… - Proceedings of the …, 2021 - openaccess.thecvf.com
Recent work has shown that the accuracy of machine learning models can vary substantially
when evaluated on a distribution that even slightly differs from that of the training data. As a …

Network pruning via performance maximization

S Gao, F Huang, W Cai… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Channel pruning is a class of powerful methods for model compression. When pruning a
neural network, it's ideal to obtain a sub-network with higher accuracy. However, a sub …

Exploring the limits of large scale pre-training

S Abnar, M Dehghani, B Neyshabur… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent developments in large-scale machine learning suggest that by scaling up data,
model size and training time properly, one might observe that improvements in pre-training …

Permutation equivariant neural functionals

A Zhou, K Yang, K Burns, A Cardace… - Advances in Neural …, 2024 - proceedings.neurips.cc
This work studies the design of neural networks that can process the weights or gradients of
other neural networks, which we refer to as neural functional networks (NFNs). Despite a …

Deep learning through the lens of example difficulty

R Baldock, H Maennel… - Advances in Neural …, 2021 - proceedings.neurips.cc
Existing work on understanding deep learning often employs measures that compress all
data-dependent information into a few numbers. In this work, we adopt a perspective based …