Are Gaussian data all you need? The extents and limits of universality in high-dimensional generalized linear estimation

L Pesce, F Krzakala, B Loureiro… - … on Machine Learning, 2023 - proceedings.mlr.press
In this manuscript we consider the problem of generalized linear estimation on Gaussian
mixture data with labels given by a single-index model. Our first result is a sharp asymptotic …

Wigner kernels: body-ordered equivariant machine learning without a basis

F Bigi, SN Pozdnyakov, M Ceriotti - The Journal of Chemical Physics, 2024 - pubs.aip.org
Machine-learning models based on a point-cloud representation of a physical object are
ubiquitous in scientific applications and particularly well-suited to the atomic-scale …

On double-descent in uncertainty quantification in overparametrized models

L Clarté, B Loureiro, F Krzakala… - International …, 2023 - proceedings.mlr.press
Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …

Multinomial logistic regression: Asymptotic normality on null covariates in high-dimensions

K Tan, PC Bellec - Advances in Neural Information …, 2024 - proceedings.neurips.cc
This paper investigates the asymptotic distribution of the maximum-likelihood estimate
(MLE) in multinomial logistic models in the high-dimensional regime where dimension and …

A phase transition between positional and semantic learning in a solvable model of dot-product attention

H Cui, F Behrens, F Krzakala, L Zdeborová - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate how a dot-product attention layer learns a positional attention matrix (with
tokens attending to each other based on their respective positions) and a semantic attention …

Precise asymptotic generalization for multiclass classification with overparameterized linear models

D Wu, A Sahai - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
We study the asymptotic generalization of an overparameterized linear model for multiclass
classification under the Gaussian covariates bi-level model introduced in Subramanian et …

Phase transitions in the mini-batch size for sparse and dense two-layer neural networks

R Marino, F Ricci-Tersenghi - Machine Learning: Science and …, 2024 - iopscience.iop.org
The use of mini-batches of data in training artificial neural networks is nowadays very
common. Despite its broad usage, theories explaining quantitatively how large or small the …

Implicit bias of next-token prediction

C Thrampoulidis - arXiv preprint arXiv:2402.18551, 2024 - arxiv.org
Next-token prediction (NTP), the go-to training paradigm in training large language models,
involves predicting the next token in a sequence. Departing from traditional one-hot …

A convergence analysis of approximate message passing with non-separable functions and applications to multi-class classification

B Çakmak, YM Lu, M Opper - arXiv preprint arXiv:2402.08676, 2024 - arxiv.org
Motivated by the recent application of approximate message passing (AMP) to the analysis
of convex optimizations in multi-class classifications [Loureiro et al., 2021], we present a …

A study of uncertainty quantification in overparametrized high-dimensional models

Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …