Asynchronous decentralized parallel stochastic gradient descent
Most commonly used distributed machine learning systems are either synchronous or
centralized asynchronous. Synchronous algorithms like AllReduce-SGD perform poorly in a …
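A minimal single-process simulation of the pairwise-gossip idea behind asynchronous decentralized SGD: at each tick one worker "wakes up", averages its model with a ring neighbor, and takes a local stochastic gradient step, with no global barrier. The quadratic objectives, ring topology, and step size below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, T, lr = 8, 5, 2000, 0.05

# Hypothetical local objectives: worker i minimizes ||x - a_i||^2 / 2,
# so the global optimum is the mean of the a_i.
A = rng.normal(size=(n, d))
x = rng.normal(size=(n, d))            # one model replica per worker

for _ in range(T):
    i = int(rng.integers(n))           # worker that "wakes up" (models asynchrony)
    j = (i + 1) % n                    # its ring neighbor
    avg = (x[i] + x[j]) / 2            # pairwise gossip averaging
    x[i] = x[j] = avg
    g = x[i] - A[i] + 0.1 * rng.normal(size=d)   # noisy local gradient
    x[i] -= lr * g                     # local SGD step, no global barrier

print(np.linalg.norm(x.mean(axis=0) - A.mean(axis=0)))   # distance to optimum
```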
D²: Decentralized Training over Decentralized Data
While training a machine learning model using multiple workers, each of which collects data
from its own data source, it would be useful when the data collected from different workers …
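A sketch of a D²-style correction under heterogeneous local data: each update reuses the previous iterate and stochastic gradient so that, with the stated initialization, the recursion telescopes to an SGD step on the average objective. The quadratic objectives and fully connected mixing matrix are assumptions for illustration, and the update below is a paraphrase of the idea rather than the paper's pseudocode.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, T, lr = 4, 3, 3000, 0.05

A = 3.0 * rng.normal(size=(n, d))      # heterogeneous local data per worker
W = np.full((n, n), 1.0 / n)           # doubly stochastic mixing (fully connected)

def grads(X):                          # noisy gradients of ||x_i - a_i||^2 / 2
    return X - A + 0.05 * rng.normal(size=X.shape)

X_prev = np.zeros((n, d))
G_prev = grads(X_prev)
X = W @ (X_prev - lr * G_prev)         # ordinary decentralized SGD step to start
for _ in range(T):
    G = grads(X)
    # Heterogeneity correction: subtract the previous iterate and gradient,
    # so the per-worker bias cancels across iterations.
    X_next = W @ (2 * X - X_prev - lr * (G - G_prev))
    X_prev, G_prev, X = X, G, X_next

print(np.linalg.norm(X.mean(axis=0) - A.mean(axis=0)))
```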
Push–pull gradient methods for distributed optimization in networks
In this article, we focus on solving a distributed convex optimization problem in a network,
where each agent has its own convex cost function and the goal is to minimize the sum of …
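A small deterministic sketch of a push-pull-style update on a directed ring: decision variables are mixed with a row-stochastic matrix R ("pull"), while gradient trackers are mixed with a column-stochastic matrix C ("push"). The quadratic costs, ring topology, and step size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, T, alpha = 4, 2, 500, 0.1
A = rng.normal(size=(n, d))

def grad(X):                           # deterministic gradients of ||x_i - a_i||^2 / 2
    return X - A

# Directed ring: R mixes decision variables (row-stochastic, "pull"),
# C mixes gradient trackers (column-stochastic, "push").
R = np.zeros((n, n))
C = np.zeros((n, n))
for i in range(n):
    R[i, i] = R[i, (i - 1) % n] = 0.5
    C[i, i] = C[(i + 1) % n, i] = 0.5

X = rng.normal(size=(n, d))
Y = grad(X)                            # trackers start at the local gradients
for _ in range(T):
    X_new = R @ (X - alpha * Y)        # pull: descend along trackers, then mix
    Y = C @ Y + grad(X_new) - grad(X)  # push: track the average gradient
    X = X_new

print(np.linalg.norm(X[0] - A.mean(axis=0)))   # every row converges to the optimum
```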
Communication compression for decentralized training
Optimizing distributed learning systems is an art of balancing between computation and
communication. There have been two lines of research that try to deal with slower …
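A single-worker sketch of one standard ingredient from this line of work: a top-k compressor combined with error feedback, which stores the un-transmitted residual and reinjects it at the next step. Compressed decentralized methods typically apply this per worker to the gossiped messages; the dimensions and step size here are arbitrary.

```python
import numpy as np

def topk(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(3)
d, k, T, lr = 100, 10, 500, 0.1
a = rng.normal(size=d)
x = np.zeros(d)
err = np.zeros(d)                      # error-feedback memory

for _ in range(T):
    g = x - a + 0.05 * rng.normal(size=d)   # noisy gradient of ||x - a||^2 / 2
    msg = topk(g + err, k)             # only k coordinates are "transmitted"
    err = g + err - msg                # remember what was left out
    x -= lr * msg

print(np.linalg.norm(x - a))
```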
Exponential graph is provably efficient for decentralized deep training
Decentralized SGD is an emerging training method for deep learning, known for its much lower (and thus faster) communication per iteration, which relaxes the averaging step in parallel …
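An illustrative sketch of the one-peer exponential-graph schedule: with n = 2^m workers, at step t each worker averages with the single peer at hop distance 2^(t mod m), so every round costs one message per worker yet the sequence of graphs mixes quickly. The quadratic objectives and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, T, lr = 8, 3, 600, 0.1           # n must be a power of two here
m = int(np.log2(n))

def peer(i, t):
    """One-peer exponential graph: at step t, worker i averages with the
    single peer at hop distance 2^(t mod log2(n))."""
    return (i + 2 ** (t % m)) % n

A = rng.normal(size=(n, d))
X = np.zeros((n, d))
for t in range(T):
    mixed = np.array([(X[i] + X[peer(i, t)]) / 2 for i in range(n)])
    X = mixed - lr * (mixed - A + 0.05 * rng.normal(size=(n, d)))

print(np.linalg.norm(X.mean(axis=0) - A.mean(axis=0)))
```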
A general framework for decentralized optimization with first-order methods
Decentralized optimization to minimize a finite sum of functions, distributed over a network of
nodes, has been a significant area within control and signal-processing research due to its …
Networked signal and information processing: Learning by multiagent systems
This article reviews significant advances in networked signal and information processing (SIP), which over the last 25 years have enabled extending decision making and inference …
An improved convergence analysis for decentralized online stochastic non-convex optimization
In this paper, we study decentralized online stochastic non-convex optimization over a
network of nodes. Integrating a technique called gradient tracking in decentralized …
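A compact sketch of the gradient-tracking recursion the abstract refers to: besides mixing the iterates, each node maintains a tracker y_i that estimates the network-average gradient through a dynamic-consensus update. The ring mixing matrix and quadratic stochastic objectives are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, T, lr = 4, 3, 2000, 0.03
A = 2.0 * rng.normal(size=(n, d))

def sgrad(X):                          # heterogeneous stochastic local gradients
    return X - A + 0.05 * rng.normal(size=X.shape)

# Symmetric doubly stochastic mixing over an undirected ring.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] += 0.25
    W[i, (i + 1) % n] += 0.25

X = np.zeros((n, d))
G = sgrad(X)
Y = G.copy()                           # tracker of the network-average gradient
for _ in range(T):
    X = W @ (X - lr * Y)               # consensus step along the tracked direction
    G_new = sgrad(X)
    Y = W @ Y + G_new - G              # dynamic-consensus (gradient-tracking) update
    G = G_new

print(np.linalg.norm(X.mean(axis=0) - A.mean(axis=0)))
```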
Decentralized proximal gradient algorithms with linear convergence rates
This article studies a class of nonsmooth decentralized multiagent optimization problems
where the agents aim at minimizing a sum of local strongly convex, smooth components plus …
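A sketch of a plain (uncorrected) decentralized proximal gradient step for a smooth strongly convex term plus an l1 regularizer: mix, take a gradient step, then apply the soft-thresholding prox. This naive variant only reaches a neighborhood of the solution; the article's methods add correction terms to obtain linear convergence. The objectives and mixing matrix are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(6)
n, d, T, lr, lam = 4, 5, 1000, 0.1, 0.05
A = rng.normal(size=(n, d))
W = np.full((n, n), 1.0 / n)           # doubly stochastic mixing

X = np.zeros((n, d))
for _ in range(T):
    G = X - A                          # gradients of the smooth strongly convex parts
    X = soft_threshold(W @ X - lr * G, lr * lam)   # mix, descend, then prox

print(X[0])                            # sparse consensus iterate
```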
Communication-efficient distributed optimization in networks with gradient tracking and variance reduction
There is growing interest in large-scale machine learning and optimization over
decentralized networks, e.g., in the context of multi-agent learning and federated learning …
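A single-node sketch of the SVRG-style variance-reduced estimator that such methods combine with gradient tracking: a stochastic gradient is corrected by a periodically refreshed full-gradient snapshot, so its variance vanishes near the snapshot and a constant step size becomes possible. The quadratic finite sum below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
N, d, lr, epochs, inner = 40, 3, 0.05, 20, 80
B = rng.normal(size=(N, d, d)) / np.sqrt(d)
A = rng.normal(size=(N, d))            # f_j(x) = ||B_j x - a_j||^2 / 2

def g(x, j):                           # gradient of the j-th component
    return B[j].T @ (B[j] @ x - A[j])

x = np.zeros(d)
for _ in range(epochs):
    x_ref = x.copy()                   # snapshot point
    full = np.mean([g(x_ref, j) for j in range(N)], axis=0)
    for _ in range(inner):
        j = int(rng.integers(N))
        # Variance-reduced estimator: unbiased, and its variance vanishes
        # as x approaches x_ref, which permits a constant step size.
        v = g(x, j) - g(x_ref, j) + full
        x -= lr * v

# Compare with the exact minimizer of the quadratic finite sum.
H = np.mean([B[j].T @ B[j] for j in range(N)], axis=0)
b = np.mean([B[j].T @ A[j] for j in range(N)], axis=0)
print(np.linalg.norm(x - np.linalg.solve(H, b)))
```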