Variance-reduced methods for machine learning

RM Gower, M Schmidt, F Bach… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
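
As a minimal sketch of the variance-reduction idea this survey covers (not code taken from the paper), an SVRG-style loop corrects each stochastic gradient with a periodically recomputed full gradient; the function names and the toy least-squares problem below are assumptions made for illustration.

```python
import numpy as np

def svrg(grad_i, n, x0, step=0.01, epochs=10, inner=None):
    """Minimal SVRG sketch: grad_i(x, i) returns the gradient of the
    i-th component function and n is the number of components."""
    x = x0.copy()
    m = inner or n  # inner-loop length, commonly on the order of n
    for _ in range(epochs):
        x_ref = x.copy()
        full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = np.random.randint(n)
            # zero-mean correction keeps the estimate unbiased while its
            # variance shrinks as x approaches the reference point x_ref
            v = grad_i(x, i) - grad_i(x_ref, i) + full_grad
            x = x - step * v
    return x

# toy usage on least squares: f_i(x) = 0.5 * (a_i @ x - b_i) ** 2
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
grad = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = svrg(grad, len(b), np.zeros(5))
```

Because the correction term has zero mean, the estimate stays unbiased while its variance shrinks as the iterates approach the reference point, which is what allows constant step sizes and linear convergence on strongly convex problems.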

Deep learning for load forecasting with smart meter data: Online Adaptive Recurrent Neural Network

MN Fekri, H Patel, K Grolinger, V Sharma - Applied Energy, 2021 - Elsevier
Electricity load forecasting has been attracting research and industry attention because of its
importance for energy management, infrastructure planning, and budgeting. In recent years …

Making AI forget you: Data deletion in machine learning

A Ginart, M Guan, G Valiant… - Advances in neural …, 2019 - proceedings.neurips.cc
Intense recent discussions have focused on how to provide individuals with control over
when their data can and cannot be used; the EU's Right To Be Forgotten regulation is an …

Train faster, generalize better: Stability of stochastic gradient descent

M Hardt, B Recht, Y Singer - International conference on …, 2016 - proceedings.mlr.press
We show that parametric models trained by a stochastic gradient method (SGM) with few
iterations have vanishing generalization error. We prove our results by arguing that SGM is …
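
One way to see the stability claim concretely is to run the same stochastic gradient method, with the same sampling seed, on two datasets that differ in a single example and measure how far the resulting parameters drift apart; the sketch below is a generic illustration of that uniform-stability quantity, not the authors' analysis or experiments.

```python
import numpy as np

def sgm(A, b, steps=200, lr=0.05, seed=0):
    """Plain stochastic gradient method on a least-squares objective."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        i = rng.integers(len(b))
        x -= lr * (A[i] @ x - b[i]) * A[i]
    return x

rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
A2, b2 = A.copy(), b.copy()
A2[0], b2[0] = rng.normal(size=5), rng.normal()  # neighboring dataset: one example replaced

# same seed -> same sampling path, so the gap isolates the effect of the
# single changed example (the quantity bounded by uniform stability)
gap = np.linalg.norm(sgm(A, b) - sgm(A2, b2))
print(f"parameter gap between the two runs: {gap:.4f}")
```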

Straggler-resilient federated learning: Leveraging the interplay between statistical accuracy and system heterogeneity

A Reisizadeh, I Tziotis, H Hassani… - IEEE Journal on …, 2022 - ieeexplore.ieee.org
Federated learning is a novel paradigm that involves learning from data samples distributed
across a large network of clients while the data remains local. It is, however, known that …

A linearly-convergent stochastic L-BFGS algorithm

P Moritz, R Nishihara, M Jordan - Artificial Intelligence and …, 2016 - proceedings.mlr.press
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for
strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic …
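
For context, the classical L-BFGS building block such methods rest on is the two-loop recursion, which turns a (possibly stochastic) gradient estimate into a quasi-Newton direction from recent curvature pairs; the sketch below shows only that generic recursion, not the variance-reduced algorithm proposed in the paper.

```python
import numpy as np

def lbfgs_direction(g, s_hist, y_hist):
    """Standard L-BFGS two-loop recursion: approximates H^{-1} @ g from
    stored curvature pairs s_k = x_{k+1} - x_k, y_k = grad_{k+1} - grad_k.
    g may be a stochastic (e.g. variance-reduced) gradient estimate."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append(a)
    if s_hist:  # scale by gamma = (s @ y) / (y @ y) as the initial Hessian guess
        s, y = s_hist[-1], y_hist[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_hist, y_hist), reversed(alphas)):
        rho = 1.0 / (y @ s)
        b = rho * (y @ q)
        q += (a - b) * s
    return q  # the update step is x <- x - step * q
```

In the stochastic setting the curvature pairs are usually formed from averaged iterates or subsampled Hessian-vector products rather than raw noisy gradient differences, which is where methods of this kind differ from the deterministic recipe above.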

On analog gradient descent learning over multiple access fading channels

T Sery, K Cohen - IEEE Transactions on Signal Processing, 2020 - ieeexplore.ieee.org
We consider a distributed learning problem over a multiple access channel (MAC) in a
large wireless network. The computation is performed at the network edge and is based on …

The step decay schedule: A near optimal, geometrically decaying learning rate procedure for least squares

R Ge, SM Kakade, R Kidambi… - Advances in neural …, 2019 - proceedings.neurips.cc
Minimax optimal convergence rates for numerous classes of stochastic convex optimization
problems are well characterized, where the majority of results utilize iterate averaged …
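
As a reminder of what a step-decay (geometrically decaying) schedule looks like in practice, the sketch below splits the horizon into roughly log2(T) stages and multiplies the learning rate by a constant factor at each stage boundary; the halving factor and stage count are illustrative defaults, not the constants analyzed in the paper.

```python
import math

def step_decay_lr(t, total_steps, lr0=1.0, num_stages=None, factor=0.5):
    """Piecewise-constant schedule: split the horizon into equal stages and
    multiply the learning rate by `factor` at every stage boundary."""
    stages = num_stages or max(1, int(math.log2(total_steps)))
    stage_len = max(1, total_steps // stages)
    stage = min(t // stage_len, stages - 1)
    return lr0 * factor ** stage

# example: over 1024 steps there are 10 stages, so the rate halves every ~102 steps
lrs = [step_decay_lr(t, 1024) for t in range(1024)]
```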

The heavy-tail phenomenon in SGD

M Gurbuzbalaban, U Simsekli… - … Conference on Machine …, 2021 - proceedings.mlr.press
In recent years, various notions of capacity and complexity have been proposed for
characterizing the generalization properties of stochastic gradient descent (SGD) in deep …

ProxSARAH: An efficient algorithmic framework for stochastic composite nonconvex optimization

NH Pham, LM Nguyen, DT Phan… - Journal of Machine …, 2020 - jmlr.org
We propose a new stochastic first-order algorithmic framework to solve stochastic composite
nonconvex optimization problems that covers both finite-sum and expectation settings. Our …
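
To make the terminology concrete, a stochastic composite step of this flavor pairs a SARAH-style recursive gradient estimator with a proximal operator for the nonsmooth term; the single-sample sketch below, with an l1 regularizer and made-up names, is a simplification and omits details such as mini-batching and the averaging/step-size schemes specified in the full ProxSARAH framework.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_sarah_sketch(grad_i, n, x0, lam=0.01, step=0.05, epochs=5):
    """Simplified sketch of a SARAH estimator + proximal step for
    min_x (1/n) * sum_i f_i(x) + lam * ||x||_1."""
    rng = np.random.default_rng(0)
    x = x0.copy()
    for _ in range(epochs):
        x_prev = x.copy()
        v = np.mean([grad_i(x, i) for i in range(n)], axis=0)  # full-gradient anchor
        x = soft_threshold(x - step * v, step * lam)
        for _ in range(n):
            i = rng.integers(n)
            v = grad_i(x, i) - grad_i(x_prev, i) + v  # SARAH recursive estimate
            x_prev = x.copy()
            x = soft_threshold(x - step * v, step * lam)
    return x
```

Unlike the SVRG correction, the SARAH estimate is biased, but the recursion keeps its variance controlled between full-gradient anchors, which is what makes it attractive for nonconvex composite problems.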