Learning curves for the multi-class teacher–student perceptron

Are Gaussian data all you need? The extents and limits of universality in high-dimensional generalized linear estimation

L Pesce, F Krzakala, B Loureiro… - … on Machine Learning, 2023 - proceedings.mlr.press

In this manuscript we consider the problem of generalized linear estimation on Gaussian
mixture data with labels given by a single-index model. Our first result is a sharp asymptotic …

Cited by: 16 · Related articles · All 8 versions

Wigner kernels: body-ordered equivariant machine learning without a basis

F Bigi, SN Pozdnyakov, M Ceriotti - The Journal of Chemical Physics, 2024 - pubs.aip.org

Machine-learning models based on a point-cloud representation of a physical object are
ubiquitous in scientific applications and particularly well-suited to the atomic-scale …

Cited by: 13 · Related articles · All 4 versions

On double-descent in uncertainty quantification in overparametrized models

L Clarté, B Loureiro, F Krzakala… - International …, 2023 - proceedings.mlr.press

Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …

Cited by: 11 · Related articles · All 6 versions

Multinomial logistic regression: Asymptotic normality on null covariates in high-dimensions

K Tan, PC Bellec - Advances in Neural Information …, 2024 - proceedings.neurips.cc

This paper investigates the asymptotic distribution of the maximum-likelihood estimate
(MLE) in multinomial logistic models in the high-dimensional regime where dimension and …

Cited by: 6 · Related articles · All 12 versions

A phase transition between positional and semantic learning in a solvable model of dot-product attention

H Cui, F Behrens, F Krzakala, L Zdeborová - arXiv preprint arXiv …, 2024 - arxiv.org

We investigate how a dot-product attention layer learns a positional attention matrix (with
tokens attending to each other based on their respective positions) and a semantic attention …

Cited by: 4 · Related articles · All 3 versions

Precise asymptotic generalization for multiclass classification with overparameterized linear models

D Wu, A Sahai - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc

We study the asymptotic generalization of an overparameterized linear model for multiclass
classification under the Gaussian covariates bi-level model introduced in Subramanian et …

Cited by: 2 · Related articles · All 5 versions

Phase transitions in the mini-batch size for sparse and dense two-layer neural networks

R Marino, F Ricci-Tersenghi - Machine Learning: Science and …, 2024 - iopscience.iop.org

The use of mini-batches of data in training artificial neural networks is nowadays very
common. Despite its broad usage, theories explaining quantitatively how large or small the …

Cited by: 6 · Related articles · All 8 versions

Implicit bias of next-token prediction

C Thrampoulidis - arXiv preprint arXiv:2402.18551, 2024 - arxiv.org

Next-token prediction (NTP), the go-to training paradigm in training large language models,
involves predicting the next token in a sequence. Departing from traditional one-hot …

Cited by: 2 · Related articles · All 2 versions

A convergence analysis of approximate message passing with non-separable functions and applications to multi-class classification

B Çakmak, YM Lu, M Opper - arXiv preprint arXiv:2402.08676, 2024 - arxiv.org

Motivated by the recent application of approximate message passing (AMP) to the analysis
of convex optimizations in multi-class classifications [Loureiro et al., 2021], we present a …

Cited by: 2 · Related articles · All 2 versions

A study of uncertainty quantification in overparametrized high-dimensional models

L Clarté, B Loureiro, F Krzakala, L Zdeborová - 2022 - hal.science

Uncertainty quantification is a central challenge in reliable and trustworthy machine
learning. Naive measures such as last-layer scores are well-known to yield overconfident …

Cited by: 5 · Related articles · All 2 versions