A comprehensive survey on coded distributed computing: Fundamentals, challenges, and networking applications
Distributed computing has become a common approach for large-scale computation tasks
due to benefits such as high reliability, scalability, computation speed, and cost …
Private retrieval, computing, and learning: Recent progress and future challenges
Most of our lives are conducted in the cyberspace. The human notion of privacy translates
into a cyber notion of privacy on many functions that take place in the cyberspace. This …
Federated learning with buffered asynchronous aggregation
Scalability and privacy are two critical concerns for cross-device federated learning (FL)
systems. In this work, we identify that synchronous FL cannot scale efficiently beyond a few …
Lagrange coded computing: Optimal design for resiliency, security, and privacy
We consider a scenario involving computations over a massive dataset stored distributedly
across multiple workers, which is at the core of distributed learning algorithms. We propose …
Polynomial codes: an optimal design for high-dimensional coded matrix multiplication
Q Yu, M Maddah-Ali… - Advances in Neural …, 2017 - proceedings.neurips.cc
We consider a large-scale matrix multiplication problem where the computation is carried
out using a distributed system with a master node and multiple worker nodes, where each …
Short-dot: Computing large linear transforms distributedly using coded short dot products
Faced with saturation of Moore's law and increasing size and dimension of data, system
designers have increasingly resorted to parallel and distributed computing to reduce …
On the optimal recovery threshold of coded matrix multiplication
S Dutta, M Fahim, F Haddadpour… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
We provide novel coded computation strategies for distributed matrix-matrix products that
outperform the recent “Polynomial code” constructions in recovery threshold, i.e., the required …
Coded computation over heterogeneous clusters
In large-scale distributed computing clusters, such as Amazon EC2, there are several types
of “system noise” that can result in major degradation of performance: system failures …
Communication-computation efficient gradient coding
This paper develops coding techniques to reduce the running time of distributed learning
tasks. It characterizes the fundamental tradeoff to compute gradients in terms of three …
GASP codes for secure distributed matrix multiplication
RGL D'Oliveira, S El Rouayheb… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
We consider the problem of secure distributed matrix multiplication (SDMM) in which a user
wishes to compute the product of two matrices with the assistance of honest but curious …