Are Gaussian data all you need? The extents and limits of universality in high-dimensional generalized linear estimation
In this manuscript we consider the problem of generalized linear estimation on Gaussian
mixture data with labels given by a single-index model. Our first result is a sharp asymptotic …
Wigner kernels: body-ordered equivariant machine learning without a basis
F Bigi, SN Pozdnyakov, M Ceriotti - The Journal of Chemical Physics, 2024 - pubs.aip.org
Machine-learning models based on a point-cloud representation of a physical object are
ubiquitous in scientific applications and particularly well-suited to the atomic-scale …
On double-descent in uncertainty quantification in overparametrized models
Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …
Multinomial logistic regression: Asymptotic normality on null covariates in high-dimensions
This paper investigates the asymptotic distribution of the maximum-likelihood estimate
(MLE) in multinomial logistic models in the high-dimensional regime where dimension and …
A phase transition between positional and semantic learning in a solvable model of dot-product attention
We investigate how a dot-product attention layer learns a positional attention matrix (with
tokens attending to each other based on their respective positions) and a semantic attention …
Precise asymptotic generalization for multiclass classification with overparameterized linear models
We study the asymptotic generalization of an overparameterized linear model for multiclass
classification under the Gaussian covariates bi-level model introduced in Subramanian et …
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks
R Marino, F Ricci-Tersenghi - Machine Learning: Science and …, 2024 - iopscience.iop.org
The use of mini-batches of data in training artificial neural networks is nowadays very
common. Despite its broad usage, theories explaining quantitatively how large or small the …
Implicit bias of next-token prediction
C Thrampoulidis - arXiv preprint arXiv:2402.18551, 2024 - arxiv.org
Next-token prediction (NTP), the go-to training paradigm in training large language models,
involves predicting the next token in a sequence. Departing from traditional one-hot …
A convergence analysis of approximate message passing with non-separable functions and applications to multi-class classification
Motivated by the recent application of approximate message passing (AMP) to the analysis
of convex optimizations in multi-class classifications [Loureiro et al., 2021], we present a …
A study of uncertainty quantification in overparametrized high-dimensional models
Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …