Quantum variational algorithms are swamped with traps

ER Anschuetz, BT Kiani - Nature Communications, 2022 - nature.com
One of the most important properties of classical neural networks is how surprisingly
trainable they are, though their training algorithms typically rely on optimizing complicated …

Smoothing the landscape boosts the signal for SGD: Optimal sample complexity for learning single index models

A Damian, E Nichani, R Ge… - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
We focus on the task of learning a single index model $\sigma(w^\star \cdot x)$ with respect
to the isotropic Gaussian distribution in $d$ dimensions. Prior work has shown that the …
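
To make the setup concrete, here is a minimal data-generation sketch for a single index model under isotropic Gaussian inputs; the link function (ReLU) and all sizes are illustrative choices, not taken from the paper.

    import numpy as np

    d, n = 100, 10_000
    rng = np.random.default_rng(0)

    # Hidden unit-norm direction w* defining the single index model.
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)

    X = rng.standard_normal((n, d))       # x ~ N(0, I_d): isotropic Gaussian in d dimensions
    sigma = lambda t: np.maximum(t, 0.0)  # illustrative link function (ReLU)
    y = sigma(X @ w_star)                 # labels y = sigma(w* . x)

Learning the model then amounts to recovering the direction w_star from (X, y) alone.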

Machine un-learning: an overview of techniques, applications, and future directions

S Sai, U Mittal, V Chamola, K Huang, I Spinelli… - Cognitive Computation, 2024 - Springer
ML applications proliferate across various sectors. Large internet firms employ ML to train
intelligent models using vast datasets, including sensitive user information. However, new …

Statistical algorithms and a lower bound for detecting planted cliques

V Feldman, E Grigorescu, L Reyzin… - Journal of the ACM, 2017 - dl.acm.org
We introduce a framework for proving lower bounds on computational problems over
distributions against algorithms that can be implemented using access to a statistical query …
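
As background for the statistical query (SQ) access model the abstract refers to, a toy oracle can be sketched as follows; the interface and the uniform-noise tolerance model are a standard simplification, not this paper's exact formalism.

    import numpy as np

    def sq_oracle(query, samples, tau, rng):
        # Answer E[query(z)] over the distribution to within tolerance tau.
        # `query` must map a sample into [-1, 1]; the adversarial slack is
        # modeled here as uniform noise of magnitude at most tau.
        true_mean = float(np.mean([query(z) for z in samples]))
        return true_mean + rng.uniform(-tau, tau)

An SQ algorithm may touch the data only through such bounded-tolerance expectation queries; lower bounds in this model count how many queries any such algorithm must make.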

Superpolynomial lower bounds for learning one-layer neural networks using gradient descent

S Goel, A Gollakota, Z Jin… - International Conference on Machine Learning, 2020 - proceedings.mlr.press
We give the first superpolynomial lower bounds for learning one-layer neural networks with
respect to the Gaussian distribution for a broad class of algorithms. In the regression setting …

Near-optimal SQ lower bounds for agnostically learning halfspaces and ReLUs under Gaussian marginals

I Diakonikolas, D Kane, N Zarifis - Advances in Neural Information Processing Systems, 2020 - proceedings.neurips.cc
We study the fundamental problems of agnostically learning halfspaces and ReLUs under
Gaussian marginals. In the former problem, given labeled examples $(\mathbf{x}, y)$ from an …
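
For reference, the standard agnostic guarantee targeted in this line of work (the general definition, not this paper's specific statement) asks for a hypothesis $h$ with

    \Pr_{(\mathbf{x}, y) \sim D}[h(\mathbf{x}) \neq y]
      \le \min_{f \in \mathcal{C}} \Pr_{(\mathbf{x}, y) \sim D}[f(\mathbf{x}) \neq y] + \epsilon,

where $\mathcal{C}$ is the concept class (e.g. halfspaces $\mathbf{x} \mapsto \mathrm{sign}(\langle w, \mathbf{x} \rangle - \theta)$), the marginal of $\mathbf{x}$ is Gaussian, and no assumption is placed on how the labels $y$ are generated; the ReLU variant states the analogous guarantee in square loss.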

The optimality of polynomial regression for agnostic learning under Gaussian marginals in the SQ model

I Diakonikolas, DM Kane, T Pittas… - Conference on Learning Theory, 2021 - proceedings.mlr.press
We study the problem of agnostic learning under the Gaussian distribution in the Statistical
Query (SQ) model. We develop a method for finding hard families of examples for a wide …

Algorithms and SQ lower bounds for PAC learning one-hidden-layer ReLU networks

I Diakonikolas, DM Kane… - Conference on Learning Theory, 2020 - proceedings.mlr.press
We study the problem of PAC learning one-hidden-layer ReLU networks with $k$ hidden
units on $\mathbb{R}^d$ under Gaussian marginals in the presence of additive label …
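
The function class in question can be written down in a few lines; the unconstrained output weights and the noise level below are illustrative assumptions, not the paper's exact conditions.

    import numpy as np

    def one_hidden_layer_relu(X, W, a):
        # f(x) = sum_i a_i * ReLU(<w_i, x>) for a k-unit network.
        return np.maximum(X @ W.T, 0.0) @ a

    d, k, n = 50, 4, 1_000
    rng = np.random.default_rng(1)
    W = rng.standard_normal((k, d))   # hidden-unit weight vectors w_i
    a = rng.standard_normal(k)        # output-layer coefficients a_i
    X = rng.standard_normal((n, d))   # Gaussian marginals on R^d
    y = one_hidden_layer_relu(X, W, a) + 0.1 * rng.standard_normal(n)  # additive label noise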

Time/accuracy tradeoffs for learning a ReLU with respect to Gaussian marginals

S Goel, S Karmalkar, A Klivans - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
We consider the problem of computing the best-fitting ReLU with respect to square-loss on a
training set when the examples have been drawn according to a spherical Gaussian …
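
The objective here is simply the empirical square loss of a single ReLU; the plain (sub)gradient descent below is an illustrative baseline for this nonconvex problem, not the algorithm the paper analyzes.

    import numpy as np

    def relu_sq_loss(w, X, y):
        # Mean square loss of the hypothesis x -> max(<w, x>, 0).
        return float(np.mean((np.maximum(X @ w, 0.0) - y) ** 2))

    def fit_relu(X, y, lr=0.1, steps=500):
        # Subgradient descent on the square loss; the (pre > 0) factor is
        # the subgradient of the ReLU at each example.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(steps):
            pre = X @ w
            grad = (2.0 / n) * X.T @ ((np.maximum(pre, 0.0) - y) * (pre > 0))
            w -= lr * grad
        return w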

Provably learning a multi-head attention layer

S Chen, Y Li - arXiv preprint arXiv:2402.04084, 2024 - arxiv.org
The multi-head attention layer is one of the key components of the transformer architecture
that sets it apart from traditional feed-forward models. Given a sequence length $k$ …
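
For orientation, a standard multi-head attention forward pass looks like the following; the shapes and the omission of masking and bias terms are simplifications, and this is the generic layer, not the specific parameterization the paper studies.

    import numpy as np

    def softmax(Z):
        Z = Z - Z.max(axis=-1, keepdims=True)
        E = np.exp(Z)
        return E / E.sum(axis=-1, keepdims=True)

    def multi_head_attention(X, Wq, Wk, Wv, Wo):
        # X: (k, d) input sequence of length k; Wq/Wk/Wv: (h, d, d_head);
        # Wo: (h * d_head, d) output projection.
        heads = []
        for q, kmat, v in zip(Wq, Wk, Wv):
            Q, K, V = X @ q, X @ kmat, X @ v
            A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (k, k) attention weights
            heads.append(A @ V)
        return np.concatenate(heads, axis=-1) @ Wo

    k, d, h, d_head = 8, 16, 2, 4
    rng = np.random.default_rng(2)
    X = rng.standard_normal((k, d))
    Wq, Wk, Wv = (rng.standard_normal((h, d, d_head)) for _ in range(3))
    Wo = rng.standard_normal((h * d_head, d))
    out = multi_head_attention(X, Wq, Wk, Wv, Wo)  # (k, d) output sequence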