Byzantine machine learning: A primer

R Guerraoui, N Gupta, R Pinot - ACM Computing Surveys, 2024 - dl.acm.org
The problem of Byzantine resilience in distributed machine learning, aka Byzantine machine
learning, consists of designing distributed algorithms that can train an accurate model …
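
The central device surveyed in this line of work is to replace the plain averaging step of distributed SGD with a robust aggregation rule that a minority of corrupted workers cannot dominate. A minimal sketch, assuming NumPy and using the coordinate-wise median (one of the simplest rules discussed in this literature); the function name and toy setup are illustrative, not the primer's specific recommendation:

```python
import numpy as np

def coordinate_wise_median(gradients):
    """Aggregate worker gradients by taking the median in each coordinate.

    Unlike the mean, the per-coordinate median cannot be dragged arbitrarily
    far by a minority of Byzantine (arbitrarily corrupted) gradients.
    """
    stacked = np.stack(gradients, axis=0)   # shape: (num_workers, dim)
    return np.median(stacked, axis=0)

# Toy usage: 4 honest workers plus 1 Byzantine worker sending a huge vector.
honest = [np.array([1.0, -0.5]) + 0.1 * np.random.randn(2) for _ in range(4)]
byzantine = [np.array([1e6, -1e6])]
agg = coordinate_wise_median(honest + byzantine)
print(agg)  # stays close to the honest gradients, unlike the plain mean
```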

Variance reduction is an antidote to byzantines: Better rates, weaker assumptions and communication compression as a cherry on the top

E Gorbunov, S Horváth, P Richtárik, G Gidel - arXiv preprint arXiv …, 2022 - arxiv.org
Byzantine-robustness has been gaining a lot of attention due to the growing interest in
collaborative and federated learning. However, many fruitful directions, such as the usage of …
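
The intuition behind the title is that a lower-variance honest gradient is easier to separate from Byzantine contributions. A minimal sketch of a SAGA-style variance-reduced estimator run by one honest worker, assuming NumPy and a least-squares toy loss; the class and memory layout are illustrative and not the paper's algorithm:

```python
import numpy as np

class SAGAWorker:
    """SAGA-style variance-reduced gradient estimator for one honest worker.

    Keeps a table of the last gradient seen for each local sample, so the
    stochastic estimate it reports has variance that shrinks over time.
    """
    def __init__(self, data, targets):
        self.data, self.targets = data, targets
        self.table = np.zeros_like(data)          # stored per-sample gradients

    def sample_gradient(self, x, j):
        # Per-sample gradient of the loss 0.5 * (a_j @ x - b_j)^2.
        return (self.data[j] @ x - self.targets[j]) * self.data[j]

    def estimate(self, x):
        j = np.random.randint(len(self.data))
        g_new = self.sample_gradient(x, j)
        # SAGA estimator: fresh gradient, minus stale one, plus table average.
        g = g_new - self.table[j] + self.table.mean(axis=0)
        self.table[j] = g_new
        return g

worker = SAGAWorker(np.random.randn(8, 3), np.random.randn(8))
print(worker.estimate(np.zeros(3)))
```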

Byzantine-resilient decentralized stochastic optimization with robust aggregation rules

Z Wu, T Chen, Q Ling - IEEE Transactions on Signal Processing, 2023 - ieeexplore.ieee.org
This article focuses on decentralized stochastic optimization in the presence of Byzantine
attacks. During the optimization process, an unknown number of malfunctioning or malicious …
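
In the decentralized setting there is no server: each agent screens and mixes the models it receives from its neighbors, then takes a local gradient step. A minimal sketch of one agent's update using a coordinate-wise trimmed mean as the screening rule, assuming NumPy; the rule and parameter names are illustrative rather than the article's specific aggregator:

```python
import numpy as np

def trimmed_mean(vectors, trim):
    """Coordinate-wise trimmed mean: drop the `trim` largest and `trim`
    smallest values in each coordinate, then average the rest.
    Requires more than 2 * trim vectors."""
    stacked = np.sort(np.stack(vectors, axis=0), axis=0)
    return stacked[trim:len(vectors) - trim].mean(axis=0)

def local_step(x_self, neighbor_models, grad, lr=0.1, trim=1):
    """One decentralized step: screen neighbor models, mix with own model,
    then descend along the local stochastic gradient."""
    mixed = 0.5 * x_self + 0.5 * trimmed_mean(neighbor_models, trim)
    return mixed - lr * grad
```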

Communication compression for byzantine robust learning: New efficient algorithms and improved rates

A Rammal, K Gruntkowska, N Fedin… - International …, 2024 - proceedings.mlr.press
Byzantine robustness is an essential feature of algorithms for certain distributed optimization
problems, typically encountered in collaborative/federated learning. These problems are …
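
Compression enters this line of work because workers would otherwise ship full-dimensional gradients every round, and the difficulty is that compression error interacts with Byzantine noise. A minimal sketch of top-k sparsification, one common compressor in this literature, assuming NumPy; how it is combined with a robust aggregator differs across algorithms and is not reproduced here:

```python
import numpy as np

def top_k(grad, k):
    """Keep only the k largest-magnitude coordinates; zero out the rest.

    The worker then transmits k (index, value) pairs instead of the dense
    vector, cutting communication roughly by a factor of dim / k.
    """
    compressed = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    compressed[idx] = grad[idx]
    return compressed

print(top_k(np.array([0.1, -3.0, 0.02, 2.5]), k=2))
```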

Robust collaborative learning with linear gradient overhead

S Farhadkhani, R Guerraoui, N Gupta… - International …, 2023 - proceedings.mlr.press
Collaborative learning algorithms, such as distributed SGD (or D-SGD), are prone to faulty
machines that may deviate from their prescribed algorithm because of software or hardware …

Byzantine robustness and partial participation can be achieved simultaneously: Just clip gradient differences

G Malinovsky, E Gorbunov, S Horváth… - Privacy Regulation and …, 2023 - openreview.net
Distributed learning has emerged as a leading paradigm for training large machine learning
models. However, in real-world scenarios, participants may be unreliable or malicious …
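
The title points to the mechanism: rather than clipping raw gradients, clip each worker's difference from a common reference vector, which bounds how far any single contribution can pull the aggregate even when only a subset of clients participates. A minimal sketch, assuming NumPy; the choice of reference point and threshold here is illustrative, not the paper's schedule:

```python
import numpy as np

def clip(v, tau):
    """Scale v down so that its Euclidean norm is at most tau."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)

def aggregate_clipped_differences(worker_grads, reference, tau):
    """Aggregate by clipping each worker's deviation from a reference vector
    (e.g., the previous round's aggregate), then re-centering."""
    diffs = [clip(g - reference, tau) for g in worker_grads]
    return reference + np.mean(diffs, axis=0)
```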

Training transformers together

A Borzunov, M Ryabinin, T Dettmers… - NeurIPS 2021 …, 2022 - proceedings.mlr.press
The infrastructure necessary for training state-of-the-art models is becoming overly
expensive, which makes training such models affordable only to large corporations and …

Secure distributed optimization under gradient attacks

S Yu, S Kar - IEEE Transactions on Signal Processing, 2023 - ieeexplore.ieee.org
In this article, we study secure distributed optimization against arbitrary gradient attacks in
multi-agent networks. In distributed optimization, there is no central server to coordinate …

Partially personalized federated learning: Breaking the curse of data heterogeneity

K Mishchenko, R Islamov, E Gorbunov… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a partially personalized formulation of Federated Learning (FL) that strikes a
balance between the flexibility of personalization and the cooperativeness of global training. In …
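
The "partial" in the title refers to splitting each client's parameters into a shared block that is synchronized globally and a personal block that never leaves the client. A minimal sketch of one client's round under such a split, assuming NumPy and a toy quadratic objective; the names, step sizes, and split are illustrative, not the paper's formulation:

```python
import numpy as np

def local_update(shared, personal, grad_fn, lr=0.05, local_steps=5):
    """One client's round under a partially personalized split: both blocks
    are updated locally, but only `shared` is sent back for global averaging."""
    for _ in range(local_steps):
        g_shared, g_personal = grad_fn(shared, personal)
        shared = shared - lr * g_shared
        personal = personal - lr * g_personal
    return shared, personal

# Toy client objective: 0.5*||shared - a||^2 + 0.5*||personal - b||^2.
a, b = np.array([1.0, 2.0]), np.array([-1.0, 0.5])
grad_fn = lambda s, p: (s - a, p - b)
shared, personal = local_update(np.zeros(2), np.zeros(2), grad_fn)
print(shared, personal)
```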

Byzantine-tolerant methods for distributed variational inequalities

N Tupitsa, AJ Almansoori, Y Wu… - Advances in …, 2024 - proceedings.neurips.cc
Robustness to Byzantine attacks is a necessity for various distributed training scenarios.
When the training reduces to the process of solving a minimization problem, Byzantine …