A comprehensive survey on training acceleration for large machine learning models in IoT
The ever-growing artificial intelligence (AI) applications have greatly reshaped our world in
many areas, e.g., smart home, computer vision, natural language processing, etc. Behind …
PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization
In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic Gradient
Estimator (PAGE)—for nonconvex optimization. PAGE is easy to implement as it is designed …
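For orientation, PAGE is usually presented as a single-loop estimator that flips a biased coin at every step: with a small probability it recomputes a large-batch gradient, and otherwise it reuses the previous estimate with a cheap correction. In the standard notation (minibatches $I$, $I'$ of sizes $b$, $b'$, switching probability $p$, stepsize $\eta$; these symbols follow common convention and are not quoted from the snippet above), the iterate update is $x^{t+1} = x^t - \eta g^t$ with
$g^{t+1} = \frac{1}{b}\sum_{i\in I}\nabla f_i(x^{t+1})$ with probability $p$, and
$g^{t+1} = g^t + \frac{1}{b'}\sum_{i\in I'}\big(\nabla f_i(x^{t+1}) - \nabla f_i(x^t)\big)$ with probability $1-p$.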
Quantum speedups for stochastic optimization
We consider the problem of minimizing a continuous function given access to a
natural quantum generalization of a stochastic gradient oracle. We provide two new …
Distributed learning in non-convex environments—Part I: Agreement at a linear rate
Driven by the need to solve increasingly complex optimization problems in signal
processing and machine learning, there has been growing interest in understanding the …
A hybrid stochastic optimization framework for composite nonconvex optimization
We introduce a new approach to develop stochastic optimization algorithms for a class of
stochastic composite and possibly nonconvex optimization problems. The main idea is to …
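The snippet cuts off before stating the main idea; for reference, hybrid estimators in this line of work are typically built as a convex combination of a SARAH-style recursive difference and a plain unbiased stochastic gradient, e.g. $v^t = \beta\big(v^{t-1} + \nabla f(x^t;\xi_t) - \nabla f(x^{t-1};\xi_t)\big) + (1-\beta)\,\nabla f(x^t;\zeta_t)$ with mixing weight $\beta\in[0,1]$. This formula is a sketch of the general hybrid SARAH/SGD idea under common notation, not a statement of this particular paper's estimator.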
FedPAGE: A fast local stochastic gradient method for communication-efficient federated learning
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a
classical federated learning algorithm in which clients run multiple local SGD steps before …
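As a point of reference, one round of FedAvg (McMahan et al., 2017), the baseline the snippet refers to, can be summarized as follows: the server broadcasts the current model $x^r$ to a sampled set of clients $S_r$; each client $k\in S_r$ initializes $y_{k,0} = x^r$ and runs $K$ local SGD steps $y_{k,j+1} = y_{k,j} - \eta\,\nabla f_k(y_{k,j};\xi_{k,j})$; the server then averages the returned models, $x^{r+1} = \frac{1}{|S_r|}\sum_{k\in S_r} y_{k,K}$. The symbols ($K$, $\eta$, $S_r$) follow common convention and are not taken from the snippet.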
A unified analysis of stochastic gradient methods for nonconvex federated optimization
Z Li, P Richtárik - arXiv preprint arXiv:2006.07013, 2020 - arxiv.org
In this paper, we study the performance of a large family of SGD variants in the smooth
nonconvex regime. To this end, we propose a generic and flexible assumption capable of …
Escape saddle points by a simple gradient-descent based algorithm
Escaping saddle points is a central research topic in nonconvex optimization. In this paper,
we propose a simple gradient-based algorithm such that for a smooth function …
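The snippet is truncated; for context, work on escaping saddle points typically measures success by convergence to an $\epsilon$-approximate second-order stationary point, commonly defined (assuming $f$ has a $\rho$-Lipschitz Hessian; this definition is standard background rather than a quotation from the paper) as a point $x$ satisfying $\|\nabla f(x)\| \le \epsilon$ and $\lambda_{\min}\big(\nabla^2 f(x)\big) \ge -\sqrt{\rho\epsilon}$.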
Simple and optimal stochastic gradient methods for nonsmooth nonconvex optimization
We propose and analyze several stochastic gradient algorithms for finding stationary points
or local minima in nonconvex, possibly with a nonsmooth regularizer, finite-sum and online …
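For readers unfamiliar with the composite setting, the usual template such methods follow is a stochastic proximal gradient step: with a smooth nonconvex part $f$, a nonsmooth regularizer $\psi$, a gradient estimate $v^t$, and stepsize $\eta$, the iterate is $x^{t+1} = \mathrm{prox}_{\eta\psi}\big(x^t - \eta v^t\big)$, where $\mathrm{prox}_{\eta\psi}(y) = \arg\min_x \{\psi(x) + \tfrac{1}{2\eta}\|x-y\|^2\}$. This is the standard composite setup under common notation, not a quotation from the abstract.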
ZeroSARAH: Efficient nonconvex finite-sum optimization with zero full gradient computation
We propose ZeroSARAH--a novel variant of the variance-reduced method SARAH (Nguyen
et al., 2017)--for minimizing the average of a large number of nonconvex functions $\frac …
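The formula is truncated in the source; for reference, finite-sum papers in this line typically minimize $f(x) = \frac{1}{n}\sum_{i=1}^n f_i(x)$, and the SARAH estimator (Nguyen et al., 2017) that ZeroSARAH builds on maintains the recursion $v^t = \frac{1}{b}\sum_{i\in I_t}\big(\nabla f_i(x^t) - \nabla f_i(x^{t-1})\big) + v^{t-1}$, which classical SARAH periodically restarts with a full gradient $v^0 = \nabla f(x^0)$; as the title states, ZeroSARAH's goal is to avoid that full gradient computation. The recursion above is standard SARAH notation, not text from the abstract.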