PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization

Z Li, H Bao, X Zhang… - … conference on machine …, 2021 - proceedings.mlr.press
In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic Gradient
Estimator (PAGE)—for nonconvex optimization. PAGE is easy to implement as it is designed …
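The PAGE estimator alternates between an occasional full-gradient refresh (with small probability p) and a cheap minibatch correction of the previous estimator. The NumPy sketch below illustrates that update under stated assumptions; the oracle names grad_full and grad_batch and all default parameters are illustrative, not the paper's notation.

```python
import numpy as np

def page_step(x, x_prev, g_prev, grad_full, grad_batch, n, batch_size=32, p=0.1, lr=0.01,
              rng=np.random.default_rng()):
    """One PAGE-style iteration (illustrative sketch).
    grad_full(x)      : gradient over all n samples
    grad_batch(x, idx): gradient averaged over the samples indexed by idx
    """
    if rng.random() < p:
        g = grad_full(x)                                   # occasionally recompute the full gradient
    else:
        idx = rng.integers(0, n, size=batch_size)          # one shared minibatch for both points
        g = g_prev + grad_batch(x, idx) - grad_batch(x_prev, idx)  # recursive correction
    return x - lr * g, g                                   # plain gradient step with the estimator
```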

[BOOK][B] First-order and stochastic optimization methods for machine learning

G Lan - 2020 - Springer
Since its beginning, optimization has played a vital role in data science. The analysis and
solution methods for many statistical and machine learning models rely on optimization. The …

Acceleration for compressed gradient descent in distributed and federated optimization

Z Li, D Kovalev, X Qian, P Richtárik - arXiv preprint arXiv:2002.11364, 2020 - arxiv.org
Due to the high communication cost in distributed and federated learning problems,
methods relying on compression of communicated messages are becoming increasingly …
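A common compressor in this line of work is unbiased random sparsification (rand-k). The sketch below shows such a compressor inside a plain compressed gradient-descent step; it is a simplified illustration only, not the accelerated scheme analyzed in the paper, and the function names are assumptions.

```python
import numpy as np

def rand_k(v, k, rng=np.random.default_rng()):
    """Keep k random coordinates, scaled by d/k so the compressor stays unbiased."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def compressed_gd_step(x, worker_grads, k=10, lr=0.1):
    """Each worker compresses its gradient; the server averages and takes a step."""
    avg = np.mean([rand_k(g, k) for g in worker_grads], axis=0)
    return x - lr * avg
```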

Convex optimization algorithms in medical image reconstruction—in the age of AI

J Xu, F Noo - Physics in Medicine & Biology, 2022 - iopscience.iop.org
The past decade has seen the rapid growth of model-based image reconstruction (MBIR)
algorithms, which are often applications or adaptations of convex optimization algorithms …

Variance reduction is an antidote to byzantines: Better rates, weaker assumptions and communication compression as a cherry on the top

E Gorbunov, S Horváth, P Richtárik, G Gidel - arXiv preprint arXiv …, 2022 - arxiv.org
Byzantine-robustness has been gaining a lot of attention due to growing interest in
collaborative and federated learning. However, many fruitful directions, such as the usage of …

Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods

Y Jin, A Sidford, K Tian - Conference on Learning Theory, 2022 - proceedings.mlr.press
We design accelerated algorithms with improved rates for several fundamental classes of
optimization problems. Our algorithms all build upon techniques related to the analysis of …
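For reference, the classical extragradient template underlying primal-dual extragradient methods takes a look-ahead step and then updates with the look-ahead gradients. The sketch below shows that basic template for a smooth min-max problem; the paper's separable and finite-sum variants add randomization and sharper rate analyses that are omitted here, and all names are illustrative.

```python
def extragradient_step(x, y, grad_x, grad_y, eta=0.1):
    """One extragradient step for min_x max_y f(x, y), given gradient oracles."""
    x_mid = x - eta * grad_x(x, y)           # extrapolation (look-ahead) step
    y_mid = y + eta * grad_y(x, y)
    x_new = x - eta * grad_x(x_mid, y_mid)   # actual update uses look-ahead gradients
    y_new = y + eta * grad_y(x_mid, y_mid)
    return x_new, y_new
```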

EF21 with bells & whistles: Practical algorithmic extensions of modern error feedback

I Fatkhullin, I Sokolov, E Gorbunov, Z Li… - arXiv preprint arXiv …, 2021 - arxiv.org
First proposed by Seide (2014) as a heuristic, error feedback (EF) is a very popular
mechanism for enforcing convergence of distributed gradient-based optimization methods …
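In the EF21 variant of error feedback, each worker keeps a gradient estimate and communicates only a compressed correction toward its fresh local gradient. Below is a minimal sketch of one round with a top-k compressor, assuming per-worker gradient oracles; it is a simplification, not the full method with the extensions the paper studies.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates (a standard contractive compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef21_round(x, g_list, worker_grads, k=10, lr=0.1):
    """One EF21-style round: workers compress the change in their gradients, not the gradients."""
    g_avg = np.mean(g_list, axis=0)
    x_next = x - lr * g_avg                       # server step with current estimates
    new_g_list = []
    for g_i, grad_i in zip(g_list, worker_grads):
        c_i = top_k(grad_i(x_next) - g_i, k)      # compressed correction sent to the server
        new_g_list.append(g_i + c_i)              # local estimate update
    return x_next, new_g_list
```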

Adaptive stochastic variance reduction for non-convex finite-sum minimization

A Kavis, S Skoulakis… - Advances in …, 2022 - proceedings.neurips.cc
We propose an adaptive variance-reduction method, called AdaSpider, for minimization of
$L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider …
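AdaSpider combines a SPIDER-type recursive gradient estimator with an adaptive step size. The sketch below pairs the SPIDER recursion with an AdaGrad-style step-size rule as a stand-in; the exact adaptive rule and restart schedule in the paper may differ, and all names and defaults are illustrative assumptions.

```python
import numpy as np

def adaspider_run(x0, grad_full, grad_batch, n, epochs=5, epoch_len=100,
                  batch_size=32, eta0=1.0, rng=np.random.default_rng()):
    x = x0.copy()
    accum = 0.0                                    # running sum of squared estimator norms
    for _ in range(epochs):
        v = grad_full(x)                           # estimator reset at the start of each epoch
        for _ in range(epoch_len):
            accum += float(np.sum(v ** 2))
            step = eta0 / (1.0 + np.sqrt(accum))   # adaptive step size (illustrative rule)
            x_prev, x = x, x - step * v
            idx = rng.integers(0, n, size=batch_size)
            v = v + grad_batch(x, idx) - grad_batch(x_prev, idx)   # SPIDER recursion
    return x
```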

CANITA: Faster rates for distributed convex optimization with communication compression

Z Li, P Richtárik - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Due to the high communication cost in distributed and federated learning, methods relying
on compressed communication are becoming increasingly popular. Besides, the best …

FedPAGE: A fast local stochastic gradient method for communication-efficient federated learning

H Zhao, Z Li, P Richtárik - arXiv preprint arXiv:2108.04755, 2021 - arxiv.org
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a
classical federated learning algorithm in which clients run multiple local SGD steps before …
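As the snippet notes, FedAvg/Local-SGD has every client run several local SGD steps before the server averages the resulting models. Below is a minimal sketch of one communication round, assuming per-client stochastic gradient oracles; FedPAGE builds on this template with PAGE-style gradient estimators, which the sketch omits.

```python
import numpy as np

def fedavg_round(x, client_grads, local_steps=5, lr=0.01, rng=np.random.default_rng()):
    """One FedAvg round: local SGD on every client, then server-side model averaging."""
    local_models = []
    for grad_i in client_grads:
        x_i = x.copy()
        for _ in range(local_steps):
            x_i = x_i - lr * grad_i(x_i, rng)     # local stochastic gradient step
        local_models.append(x_i)
    return np.mean(local_models, axis=0)          # server averages the local models
```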