A comprehensive survey on training acceleration for large machine learning models in IoT

H Wang, Z Qu, Q Zhou, H Zhang, B Luo… - IEEE Internet of …, 2021 - ieeexplore.ieee.org
The ever-growing artificial intelligence (AI) applications have greatly reshaped our world in
many areas, e.g., smart home, computer vision, natural language processing, etc. Behind …

PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization

Z Li, H Bao, X Zhang… - … conference on machine …, 2021 - proceedings.mlr.press
In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic Gradient
Estimator (PAGE)—for nonconvex optimization. PAGE is easy to implement as it is designed …
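The snippet cuts off before the estimator itself; for reference, a minimal NumPy sketch of the probabilistic update described in the PAGE paper follows. The least-squares objective, batch sizes, switching probability p, and step size below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Illustrative PAGE sketch (Li et al., 2021): with probability p, refresh the estimator
# with a large-batch gradient; otherwise reuse the previous estimator plus a cheap
# small-batch correction. Objective and hyperparameters are assumptions for demo only.
rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_batch(x, idx):
    # Minibatch gradient of the least-squares loss (1/2n) * ||Ax - b||^2 over rows idx.
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x = np.zeros(d)
eta, p, big, small = 0.05, 0.1, n, 32
g = grad_batch(x, np.arange(n))              # start from a full gradient
for t in range(200):
    x_new = x - eta * g
    if rng.random() < p:                     # occasional large-batch refresh
        g = grad_batch(x_new, rng.choice(n, big, replace=False))
    else:                                    # recursive small-batch correction
        idx = rng.choice(n, small, replace=False)
        g = g + grad_batch(x_new, idx) - grad_batch(x, idx)
    x = x_new
```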

Quantum speedups for stochastic optimization

A Sidford, C Zhang - Advances in Neural Information …, 2024 - proceedings.neurips.cc
We consider the problem of minimizing a continuous function given access to a
natural quantum generalization of a stochastic gradient oracle. We provide two new …

Distributed learning in non-convex environments—Part I: Agreement at a linear rate

S Vlaski, AH Sayed - IEEE Transactions on Signal Processing, 2021 - ieeexplore.ieee.org
Driven by the need to solve increasingly complex optimization problems in signal
processing and machine learning, there has been increasing interest in understanding the …

A hybrid stochastic optimization framework for composite nonconvex optimization

Q Tran-Dinh, NH Pham, DT Phan… - Mathematical Programming, 2022 - Springer
We introduce a new approach to develop stochastic optimization algorithms for a class of
stochastic composite and possibly nonconvex optimization problems. The main idea is to …

FedPAGE: A fast local stochastic gradient method for communication-efficient federated learning

H Zhao, Z Li, P Richtárik - arXiv preprint arXiv:2108.04755, 2021 - arxiv.org
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a
classical federated learning algorithm in which clients run multiple local SGD steps before …
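Since the snippet describes the core FedAvg/Local-SGD pattern (several local SGD steps, then server-side averaging), a minimal sketch of one training loop follows; the synthetic client data, least-squares model, and hyperparameters are assumptions for illustration only.

```python
import numpy as np

# Minimal FedAvg / Local-SGD round (McMahan et al., 2017 pattern): each client runs
# several local SGD steps from the shared model, then the server averages the results.
# Client data, the least-squares model, and all hyperparameters are illustrative assumptions.
rng = np.random.default_rng(1)
d, num_clients, local_steps, eta = 10, 5, 20, 0.05
clients = [(rng.normal(size=(100, d)), rng.normal(size=100)) for _ in range(num_clients)]

def local_update(x_global, data):
    A, y = data
    x = x_global.copy()
    for _ in range(local_steps):             # local SGD steps before communicating
        i = rng.integers(len(y), size=16)    # small minibatch of local samples
        x -= eta * A[i].T @ (A[i] @ x - y[i]) / len(i)
    return x

x_global = np.zeros(d)
for round_ in range(10):
    updates = [local_update(x_global, data) for data in clients]
    x_global = np.mean(updates, axis=0)      # server-side averaging of client models
```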

A unified analysis of stochastic gradient methods for nonconvex federated optimization

Z Li, P Richtárik - arXiv preprint arXiv:2006.07013, 2020 - arxiv.org
In this paper, we study the performance of a large family of SGD variants in the smooth
nonconvex regime. To this end, we propose a generic and flexible assumption capable of …

Escape saddle points by a simple gradient-descent based algorithm

C Zhang, T Li - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Escaping saddle points is a central research topic in nonconvex optimization. In this paper,
we propose a simple gradient-based algorithm such that for a smooth function …

Simple and optimal stochastic gradient methods for nonsmooth nonconvex optimization

Z Li, J Li - Journal of Machine Learning Research, 2022 - jmlr.org
We propose and analyze several stochastic gradient algorithms for finding stationary points
or local minima in nonconvex (possibly with a nonsmooth regularizer) finite-sum and online …

ZeroSARAH: Efficient nonconvex finite-sum optimization with zero full gradient computation

Z Li, S Hanzely, P Richtárik - arXiv preprint arXiv:2103.01447, 2021 - arxiv.org
We propose ZeroSARAH, a novel variant of the variance-reduced method SARAH (Nguyen
et al., 2017), for minimizing the average of a large number of nonconvex functions $\frac …
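For context, the SARAH-style recursive estimator that ZeroSARAH builds on can be sketched as below for a finite-sum objective of the form f(x) = (1/n) * sum_i f_i(x). The data and step size are assumptions, and ZeroSARAH's specific device for avoiding the full-gradient computation is not reproduced here.

```python
import numpy as np

# Sketch of the SARAH recursive estimator (Nguyen et al., 2017) that ZeroSARAH modifies.
# Plain SARAH initializes (and periodically restarts) with a full gradient; ZeroSARAH's
# zero-full-gradient trick is NOT shown. Data and step size are illustrative assumptions.
rng = np.random.default_rng(2)
n, d = 500, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    # Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    return A[i] * (A[i] @ x - b[i])

x_prev = np.zeros(d)
v = A.T @ (A @ x_prev - b) / n               # plain SARAH starts from a full gradient
eta = 0.05
x = x_prev - eta * v
for t in range(200):
    i = rng.integers(n)
    v = grad_i(x, i) - grad_i(x_prev, i) + v # recursive variance-reduced update
    x_prev, x = x, x - eta * v
```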