From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
L Arnaboldi, L Stephan, F Krzakala, B Loureiro
The Thirty Sixth Annual Conference on Learning Theory, 1199-1227, 2023 | Cited by 22 | 2023

The benefits of reusing batches for gradient descent in two-layer networks: Breaking the curse of information and leap exponents
Y Dandi, E Troiani, L Arnaboldi, L Pesce, L Zdeborová, F Krzakala
arXiv preprint arXiv:2402.03220, 2024 | Cited by 12 | 2024

Escaping mediocrity: How two-layer networks learn hard generalized linear models
L Arnaboldi, F Krzakala, B Loureiro, L Stephan
OPT 2023: Optimization for Machine Learning, 2023 | Cited by 8* | 2023

Repetita iuvant: Data repetition allows SGD to learn high-dimensional multi-index functions
L Arnaboldi, Y Dandi, F Krzakala, L Pesce, L Stephan
arXiv preprint arXiv:2405.15459, 2024 | Cited by 3 | 2024

Online learning and information exponents: On the importance of batch size, and time/complexity tradeoffs
L Arnaboldi, Y Dandi, F Krzakala, B Loureiro, L Pesce, L Stephan
arXiv preprint arXiv:2406.02157, 2024 | Cited by 1 | 2024

Learning transitions for one-pass stochastic gradient descent on shallow neural networks
L Arnaboldi
2022