Byzantine machine learning: A primer
The problem of Byzantine resilience in distributed machine learning, also known as Byzantine machine
learning, consists in designing distributed algorithms that can train an accurate model …
Variance reduction is an antidote to Byzantines: Better rates, weaker assumptions and communication compression as a cherry on the top
Byzantine-robustness has been gaining a lot of attention due to the growing interest in
collaborative and federated learning. However, many fruitful directions, such as the usage of …
Byzantine-resilient decentralized stochastic optimization with robust aggregation rules
This article focuses on decentralized stochastic optimization in the presence of Byzantine
attacks. During the optimization process, an unknown number of malfunctioning or malicious …
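A minimal sketch of the kind of robust aggregation rule this line of work studies, assuming a coordinate-wise median over worker gradients and NumPy; the rules analyzed in the article itself may differ.

```python
import numpy as np

def coordinate_wise_median(gradients):
    """Aggregate worker gradients by taking the median of each coordinate.

    A minority of arbitrary (Byzantine) inputs cannot drag the aggregate
    arbitrarily far from the honest majority.
    """
    stacked = np.stack(gradients, axis=0)  # shape: (num_workers, dim)
    return np.median(stacked, axis=0)

# Toy usage: four honest workers plus one Byzantine worker sending a huge vector.
true_grad = np.array([1.0, -2.0, 0.5])
honest = [true_grad + 0.1 * np.random.randn(3) for _ in range(4)]
byzantine = [np.array([1e6, 1e6, -1e6])]
print(coordinate_wise_median(honest + byzantine))  # stays close to true_grad
```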
Communication compression for Byzantine-robust learning: New efficient algorithms and improved rates
A Rammal, K Gruntkowska, N Fedin… - International …, 2024 - proceedings.mlr.press
Byzantine robustness is an essential feature of algorithms for certain distributed optimization
problems, typically encountered in collaborative/federated learning. These problems are …
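For illustration only, a sketch of top-k sparsification, one standard compression operator used in communication-efficient distributed training; the compressors and algorithms proposed in the paper may differ.

```python
import numpy as np

def top_k_compress(vector, k):
    """Keep the k largest-magnitude coordinates and zero out the rest.

    Only k (index, value) pairs need to be transmitted per round, so the
    communication cost drops from O(dim) to O(k).
    """
    keep = np.argsort(np.abs(vector))[-k:]
    out = np.zeros_like(vector)
    out[keep] = vector[keep]
    return out

g = np.array([0.1, -3.0, 0.02, 2.5, -0.4])
print(top_k_compress(g, k=2))  # [ 0.  -3.   0.   2.5  0. ]
```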
Robust collaborative learning with linear gradient overhead
Collaborative learning algorithms, such as distributed SGD (or D-SGD), are prone to faulty
machines that may deviate from their prescribed algorithm because of software or hardware …
Byzantine robustness and partial participation can be achieved simultaneously: Just clip gradient differences
Distributed learning has emerged as a leading paradigm for training large machine learning
models. However, in real-world scenarios, participants may be unreliable or malicious …
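The title suggests clipping gradient differences relative to a reference point; the following is only a hedged sketch of that general idea (the reference choice, threshold, and exact rule in the paper may differ).

```python
import numpy as np

def clip(v, tau):
    """Scale v down so that its Euclidean norm does not exceed tau."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)

def aggregate_with_clipped_differences(worker_grads, reference_grad, tau):
    """Combine worker gradients as reference + mean of clipped differences.

    Clipping each difference bounds how far any single unreliable or
    Byzantine worker can pull the aggregate away from the reference point.
    """
    diffs = [clip(g - reference_grad, tau) for g in worker_grads]
    return reference_grad + np.mean(diffs, axis=0)
```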
Training transformers together
The infrastructure necessary for training state-of-the-art models is becoming overly
expensive, which makes training such models affordable only to large corporations and …
Secure distributed optimization under gradient attacks
S Yu, S Kar - IEEE Transactions on Signal Processing, 2023 - ieeexplore.ieee.org
In this article, we study secure distributed optimization against arbitrary gradient attacks in
multi-agent networks. In distributed optimization, there is no central server to coordinate …
Partially personalized federated learning: Breaking the curse of data heterogeneity
We present a partially personalized formulation of Federated Learning (FL) that strikes a
balance between the flexibility of personalization and the cooperativeness of global training. In …
Byzantine-tolerant methods for distributed variational inequalities
N Tupitsa, AJ Almansoori, Y Wu… - Advances in …, 2024 - proceedings.neurips.cc
Robustness to Byzantine attacks is a necessity for various distributed training scenarios.
When training reduces to solving a minimization problem, Byzantine …