Byzantine-resilient non-convex stochastic gradient descent

Z Allen-Zhu, F Ebrahimian, J Li, D Alistarh - arXiv preprint arXiv …, 2020 - arxiv.org
We study adversary-resilient stochastic distributed optimization, in which $m$ machines
can independently compute stochastic gradients, and cooperate to jointly optimize over their …

Robust training in high dimensions via block coordinate geometric median descent

A Acharya, A Hashemi, P Jain… - International …, 2022 - proceedings.mlr.press
Geometric median (GM) is a classical method in statistics for achieving robust estimation of
the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 1/2 …
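For context on the snippet above: the geometric median of points x_1, …, x_n minimizes the sum of Euclidean distances to them, and a standard way to approximate it is Weiszfeld's fixed-point iteration. The sketch below illustrates only that classical estimator and its robustness to gross corruption; it is not the block coordinate GM descent method proposed in the paper, and the tolerance and iteration limits are arbitrary choices.

```python
import numpy as np

def geometric_median(points, max_iter=100, tol=1e-6):
    """Approximate the geometric median of `points` (n x d) via Weiszfeld's
    fixed-point iteration (classical estimator; not the paper's method)."""
    y = points.mean(axis=0)                    # initialize at the coordinate-wise mean
    for _ in range(max_iter):
        dists = np.linalg.norm(points - y, axis=1)
        dists = np.maximum(dists, 1e-12)       # guard against division by zero
        w = 1.0 / dists                        # inverse-distance weights
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

# Even with a minority of grossly corrupted points, the GM stays close to
# the bulk of the data (breakdown point 1/2).
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(80, 5))
corrupt = rng.normal(100.0, 1.0, size=(20, 5))   # 20% gross corruption
print(geometric_median(np.vstack([clean, corrupt])))
```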

Byzantine-robust and communication-efficient distributed non-convex learning over non-IID data

X He, H Zhu, Q Ling - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Motivated by the emerging federated learning applications, we jointly consider the problems
of Byzantine-robustness and communication efficiency in distributed non-convex learning …

C-RSA: Byzantine-robust and communication-efficient distributed learning in the non-convex and non-IID regime

X He, H Zhu, Q Ling - Signal Processing, 2023 - Elsevier
The emerging federated learning applications raise challenges of Byzantine-robustness and
communication efficiency in distributed non-convex learning over non-IID data. To address …

Byzantine resilient non-convex SCSG with distributed batch gradient computations

S Bulusu, P Khanduri, S Kafle… - … on Signal and …, 2021 - ieeexplore.ieee.org
Distributed learning is an important paradigm for current machine learning algorithms
with large datasets. In this paper, the distributed stochastic optimization problem of minimizing a …

Asynchronous SGD with stale gradient dynamic adjustment for deep learning training

T Tan, H Xie, Y Xia, X Shi, M Shang - Information Sciences, 2024 - Elsevier
Asynchronous stochastic gradient descent (ASGD) is a computationally efficient algorithm
that speeds up deep learning training and plays an important role in distributed deep …
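The title points to a dynamic adjustment for stale gradients; a common heuristic from the ASGD literature, shown here only as a hedged stand-in for whatever adjustment the paper actually uses, is to damp each worker's step size by its staleness, i.e. the number of global updates since it read the parameters.

```python
import numpy as np

def asgd_apply(params, grad, read_version, global_version, base_lr=0.1):
    """Apply one asynchronous SGD update with a staleness-damped step size.

    `read_version` is the global step at which the worker read `params`,
    `global_version` is the current global step, so
    staleness = global_version - read_version. Scaling the learning rate by
    1/(staleness + 1) is a common heuristic, used here only for illustration.
    """
    staleness = max(global_version - read_version, 0)
    lr = base_lr / (staleness + 1.0)
    return params - lr * grad, global_version + 1

# Toy usage: a gradient computed 3 steps ago is applied with a smaller step.
params = np.zeros(4)
grad = np.ones(4)
params, version = asgd_apply(params, grad, read_version=7, global_version=10)
print(params, version)
```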

Communication-efficient distributed eigenspace estimation with arbitrary node failures

V Charisopoulos, A Damle - Advances in Neural …, 2022 - proceedings.neurips.cc
We develop an eigenspace estimation algorithm for distributed environments with arbitrary
node failures, where a subset of computing nodes can return structurally valid but otherwise …
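To make the setting concrete: in distributed eigenspace estimation, each node typically computes the leading eigenvectors of its local sample covariance, and a coordinator compares or combines the local subspaces; a "structurally valid but otherwise arbitrary" response is, for example, a perfectly orthonormal basis that spans the wrong subspace. The sketch below shows only the local computation and a sin-theta-style distance a coordinator might use, under those assumptions; it is not the aggregation algorithm from the paper.

```python
import numpy as np

def local_eigenspace(samples, k):
    """Top-k eigenvectors of the local sample covariance (d x k, orthonormal)."""
    cov = np.cov(samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    return eigvecs[:, np.argsort(eigvals)[::-1][:k]]

def subspace_distance(U, V):
    """Largest sine of the principal angles between two orthonormal bases."""
    cosines = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), -1.0, 1.0)
    return float(np.sqrt(np.maximum(1.0 - cosines**2, 0.0)).max())

rng = np.random.default_rng(1)
data = rng.normal(size=(500, 10)) @ np.diag([5, 4, 3] + [1] * 7)
honest = local_eigenspace(data[:250], k=3)
also_honest = local_eigenspace(data[250:], k=3)
faulty = np.linalg.qr(rng.normal(size=(10, 3)))[0]   # orthonormal, but wrong subspace

print(subspace_distance(honest, also_honest))  # small: both recover the top subspace
print(subspace_distance(honest, faulty))       # large: structurally valid but arbitrary
```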

On Robust Machine Learning in the Presence of Adversaries

S Bulusu - 2023 - search.proquest.com
In today's highly connected world, the number of smart devices worldwide has increased
exponentially. These devices generate huge amounts of real-time data, perform complicated …

Arbitrarily Accurate Aggregation Scheme for Byzantine SGD

A Maurer - 25th International Conference on Principles of …, 2022 - drops.dagstuhl.de
A very common optimization technique in Machine Learning is Stochastic Gradient Descent
(SGD). SGD can easily be distributed: several workers try to estimate the gradient of a loss …
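Since the snippet stops mid-sentence, the sketch below only illustrates the aggregation step it describes: each worker submits a stochastic gradient estimate, and instead of plain averaging (which a single Byzantine worker can skew arbitrarily), the server can apply a robust rule such as the coordinate-wise median. This is a standard baseline, not the arbitrarily accurate scheme proposed in the paper.

```python
import numpy as np

def aggregate_gradients(worker_grads, rule="median"):
    """Combine per-worker gradient estimates (list of 1-D arrays).

    'mean' is the standard average (not Byzantine-resilient: one worker
    sending a huge vector moves it arbitrarily); 'median' is the
    coordinate-wise median, a common robust baseline. Neither is the
    specific aggregation scheme of the paper above.
    """
    stacked = np.stack(worker_grads)
    if rule == "mean":
        return stacked.mean(axis=0)
    return np.median(stacked, axis=0)

rng = np.random.default_rng(2)
true_grad = np.array([1.0, -2.0, 0.5])
honest = [true_grad + 0.1 * rng.normal(size=3) for _ in range(9)]
byzantine = [np.array([1e6, 1e6, 1e6])]          # one adversarial worker
print(aggregate_gradients(honest + byzantine, "mean"))    # ruined by the outlier
print(aggregate_gradients(honest + byzantine, "median"))  # close to true_grad
```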

Computationally efficient and robust methods for large-scale optimization and scientific computing

V Charisopoulos - 2023 - ecommons.cornell.edu
This thesis is concerned with the design and analysis of computationally efficient algorithms
for large-scale optimization and scientific computing. It aims to address two primary …