| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| How two-layer neural networks learn, one (giant) step at a time | Y Dandi, F Krzakala, B Loureiro, L Pesce, L Stephan | arXiv preprint arXiv:2305.18270 | 28 | 2023 |
| Are Gaussian data all you need? The extents and limits of universality in high-dimensional generalized linear estimation | L Pesce, F Krzakala, B Loureiro, L Stephan | International Conference on Machine Learning, 27680-27708 | 16 | 2023 |
| The benefits of reusing batches for gradient descent in two-layer networks: Breaking the curse of information and leap exponents | Y Dandi, E Troiani, L Arnaboldi, L Pesce, L Zdeborová, F Krzakala | arXiv preprint arXiv:2402.03220 | 13 | 2024 |
| Asymptotics of feature learning in two-layer networks after one gradient-step | H Cui, L Pesce, Y Dandi, F Krzakala, YM Lu, L Zdeborová, B Loureiro | arXiv preprint arXiv:2402.04980 | 7 | 2024 |
| Repetita iuvant: Data repetition allows SGD to learn high-dimensional multi-index functions | L Arnaboldi, Y Dandi, F Krzakala, L Pesce, L Stephan | arXiv preprint arXiv:2405.15459 | 3 | 2024 |
| Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs | L Arnaboldi, Y Dandi, F Krzakala, B Loureiro, L Pesce, L Stephan | arXiv preprint arXiv:2406.02157 | 1 | 2024 |
| Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap | L Pesce, B Loureiro, F Krzakala, L Zdeborová | Advances in Neural Information Processing Systems 35, 27087-27099 | 1 | 2022 |
| Theory and applications of the Sum-Of-Squares technique | F Bach, E Cornacchia, L Pesce, G Piccioli | arXiv preprint arXiv:2306.16255 | | 2023 |