Byzantine machine learning: A primer
The problem of Byzantine resilience in distributed machine learning, also known as Byzantine machine
learning, consists of designing distributed algorithms that can train an accurate model …
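For context on the setting this primer describes, the standard defense is to replace the server's plain average of worker gradients with a robust aggregation rule. Below is a minimal sketch of coordinate-wise median aggregation; it is a generic illustration of the idea, not a method taken from the primer, and the toy worker gradients are assumptions.

```python
import numpy as np

def coordinate_wise_median(worker_grads):
    """Robustly aggregate worker gradients by taking the median of each coordinate.

    worker_grads: array of shape (n_workers, dim). Even if a minority of rows is
    arbitrary (Byzantine), each coordinate's median stays within the honest values.
    """
    return np.median(worker_grads, axis=0)

# Toy usage: 4 honest workers plus 1 Byzantine worker sending a huge vector.
honest = np.random.randn(4, 10)
byzantine = 1e6 * np.ones((1, 10))
grads = np.vstack([honest, byzantine])

robust_step = coordinate_wise_median(grads)  # unaffected by the outlier's magnitude
naive_step = grads.mean(axis=0)              # dominated by the Byzantine vector
```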
EF21: A new, simpler, theoretically better, and practically faster error feedback
P Richtárik, I Sokolov… - Advances in Neural …, 2021 - proceedings.neurips.cc
Error feedback (EF), also known as error compensation, is an immensely popular
convergence stabilization mechanism in the context of distributed training of supervised …
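To make the mechanism concrete, here is a minimal sketch of an EF21-style error-feedback loop, assuming a greedy top-k compressor and toy quadratic local losses f_i(x) = 0.5·||x − b_i||² for concreteness; the stepsize, k, and initialization are illustrative choices, not values from the paper.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest (a contractive compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(0)
n, dim, k, gamma = 8, 50, 5, 0.1
b = rng.standard_normal((n, dim))            # worker i's loss: 0.5 * ||x - b[i]||^2

x = np.zeros(dim)
g = np.array([x - b[i] for i in range(n)])   # per-worker states g_i, here initialized to the local gradients

for _ in range(200):
    x = x - gamma * g.mean(axis=0)           # server step using the average of the workers' states
    for i in range(n):
        grad_i = x - b[i]                     # fresh local gradient
        g[i] = g[i] + top_k(grad_i - g[i], k) # each worker communicates only the compressed correction
```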
Communication compression techniques in distributed deep learning: A survey
Training data and neural network models are getting increasingly large, and the training
time of deep learning on a single machine is becoming unbearably long. To reduce …
Fast federated learning in the presence of arbitrary device unavailability
Federated learning (FL) coordinates numerous heterogeneous devices to collaboratively
train a shared model while preserving user privacy. Despite its multiple …
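As background for this and the other federated-learning entries, the basic FedAvg-style loop that such methods build on can be sketched as follows; this is a generic illustration with toy quadratic device losses and random device availability, not the specific algorithm this paper proposes for handling arbitrary unavailability.

```python
import numpy as np

rng = np.random.default_rng(1)
n_devices, dim, local_steps, lr, rounds = 20, 10, 5, 0.1, 50
targets = rng.standard_normal((n_devices, dim))   # device i's loss: 0.5 * ||w - targets[i]||^2

w_global = np.zeros(dim)
for _ in range(rounds):
    # Only a random subset of devices is available in each round.
    available = rng.choice(n_devices, size=n_devices // 2, replace=False)
    updates = []
    for i in available:
        w = w_global.copy()
        for _ in range(local_steps):
            w -= lr * (w - targets[i])            # local gradient steps on device i
        updates.append(w)
    w_global = np.mean(updates, axis=0)           # server averages the returned models
```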
FedNL: Making Newton-type methods applicable to federated learning
Inspired by recent work of Islamov et al. (2021), we propose a family of Federated Newton
Learn (FedNL) methods, which we believe is a marked step in the direction of making …
SoteriaFL: A unified framework for private federated learning with communication compression
To enable large-scale machine learning in bandwidth-hungry environments such as
wireless networks, significant progress has been made recently in designing communication …
Stochastic gradient descent-ascent: Unified theory and new efficient methods
A Beznosikov, E Gorbunov… - International …, 2023 - proceedings.mlr.press
Abstract Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent
algorithms for solving min-max optimization and variational inequality problems (VIPs) …
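For readers unfamiliar with the algorithm named here, a plain stochastic gradient descent-ascent iteration on a toy strongly-convex-strongly-concave saddle-point problem looks like the following; this is a generic SGDA sketch (the objective, stepsize, and noise level are assumptions), not one of the paper's new methods.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 5
A = rng.standard_normal((dim, dim))

# Saddle-point objective: f(x, y) = 0.5*||x||^2 + x^T A y - 0.5*||y||^2 (solution x = y = 0).
x = rng.standard_normal(dim)
y = rng.standard_normal(dim)
lr, noise, steps = 0.05, 0.1, 2000

for _ in range(steps):
    gx = x + A @ y + noise * rng.standard_normal(dim)    # stochastic gradient in x
    gy = A.T @ x - y + noise * rng.standard_normal(dim)  # stochastic gradient in y
    x = x - lr * gx                                      # descent step on the min variable
    y = y + lr * gy                                      # ascent step on the max variable
```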
EF21-P and friends: Improved theoretical communication complexity for distributed optimization with bidirectional compression
K Gruntkowska, A Tyurin… - … Conference on Machine …, 2023 - proceedings.mlr.press
In this work, we focus on distributed optimization problems in the setting where the
communication time between the server and the workers is non-negligible. We obtain …
Variance reduction is an antidote to Byzantines: Better rates, weaker assumptions and communication compression as a cherry on the top
Byzantine-robustness has been gaining a lot of attention due to the growing interest in
collaborative and federated learning. However, many fruitful directions, such as the usage of …
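For intuition on the variance-reduction ingredient named in the title, here is a minimal sketch of a SAGA-style variance-reduced gradient estimator on a toy finite-sum problem; it is a generic single-machine illustration (the quadratic components and stepsize are assumptions) and omits the robust aggregation and compression that the paper combines it with.

```python
import numpy as np

rng = np.random.default_rng(3)
n, dim, lr, steps = 100, 20, 0.05, 2000
b = rng.standard_normal((n, dim))               # f_j(x) = 0.5 * ||x - b[j]||^2, so grad f_j(x) = x - b[j]

x = np.zeros(dim)
table = np.array([x - b[j] for j in range(n)])  # stored gradient for each component
table_mean = table.mean(axis=0)

for _ in range(steps):
    j = rng.integers(n)
    grad_j = x - b[j]
    g = grad_j - table[j] + table_mean          # SAGA estimator: unbiased, with shrinking variance
    table_mean += (grad_j - table[j]) / n       # maintain the running mean of the table
    table[j] = grad_j
    x = x - lr * g
```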
Recent theoretical advances in non-convex optimization
Motivated by the recent increased interest in optimization algorithms for non-convex
optimization, in application to training deep neural networks and other optimization problems …