The min-max complexity of distributed stochastic convex optimization with intermittent communication
We resolve the min-max complexity of distributed stochastic convex optimization (up to a log
factor) in the intermittent communication setting, where $ M $ machines work in parallel over …
Towards optimal communication complexity in distributed non-convex optimization
We study the problem of distributed stochastic non-convex optimization with intermittent
communication. We consider the full participation setting where $ M $ machines work in …
Federated online and bandit convex optimization
We study the problems of distributed online and bandit convex optimization against an
adaptive adversary. We aim to minimize the average regret on $ M $ machines working in …
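As a point of reference, one common way to formalize the average regret over $M$ machines (an illustrative definition, not necessarily the exact one used in this paper) is

$$ \mathrm{Regret}_T \;=\; \frac{1}{M}\sum_{m=1}^{M}\sum_{t=1}^{T} f_t\big(x_t^{(m)}\big) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x), $$

where $x_t^{(m)}$ is the point played by machine $m$ at round $t$ and the losses $f_t$ are chosen by the adaptive adversary.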
The limits and potentials of local SGD for distributed heterogeneous learning with intermittent communication
Local SGD is a popular optimization method in distributed learning, often outperforming
other algorithms in practice, including mini-batch SGD. Despite this success, theoretically …
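To make the comparison concrete, here is a minimal, hypothetical sketch of Local SGD in the intermittent communication setting: $M$ machines each run $K$ local stochastic gradient steps between $R$ synchronization (averaging) rounds. The objective, step size, and noise model below are illustrative assumptions, not taken from the paper.

```python
# Minimal Local SGD sketch (illustrative assumptions, not the paper's setup):
# M machines take K local SGD steps per round, then average their iterates
# once per communication round, for R rounds in total.
import numpy as np

def local_sgd(grad, x0, M=4, R=10, K=5, lr=0.1, noise=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    x = np.tile(x0, (M, 1)).astype(float)          # one iterate per machine
    for _ in range(R):                              # communication rounds
        for _ in range(K):                          # local steps, no communication
            g = np.stack([grad(x[m]) for m in range(M)])
            g += noise * rng.standard_normal(g.shape)  # stochastic gradient noise
            x -= lr * g
        x[:] = x.mean(axis=0)                       # synchronize: average iterates
    return x[0]

# Example: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_hat = local_sgd(lambda x: x, x0=np.ones(3))
print(x_hat)  # close to the minimizer at the origin
```

Setting $K = 1$ recovers fully synchronized (mini-batch-style) updates, which is the baseline Local SGD is typically compared against.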
FLECS: A federated learning second-order framework via compression and sketching
Inspired by the recent work FedNL (Safaryan et al, FedNL: Making Newton-Type Methods
Applicable to Federated Learning), we propose a new communication-efficient second-order …
Distributed online and bandit convex optimization
We study the problems of distributed online and bandit convex optimization against an
adaptive adversary. Our goal is to minimize the average regret on M machines working in …
Exploiting higher-order derivatives in convex optimization methods
Exploiting higher-order derivatives in convex optimization has been known at least since the 1970s. In
each iteration, higher-order (also called tensor) methods minimize a regularized Taylor …
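For concreteness, a standard form of the $p$-th order regularized Taylor (tensor) step is sketched below; the exact regularization constant and norm may differ from the ones used in this work.

$$ x_{k+1} \;\in\; \operatorname*{arg\,min}_{y} \left\{ \sum_{i=1}^{p} \frac{1}{i!}\, \nabla^i f(x_k)[y - x_k]^i \;+\; \frac{H}{(p+1)!}\, \|y - x_k\|^{p+1} \right\}, $$

where $H$ is a regularization parameter tied to the Lipschitz constant of $\nabla^p f$; the case $p = 2$ recovers the cubically regularized Newton method.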
Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning
Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves
multiple tasks at the same time. This paper presents a constrained formulation for multi-task …
FLECS-CGD: A Federated Learning Second-Order Framework via Compression and Sketching with Compressed Gradient Differences
In the recent paper FLECS (Agafonov et al, FLECS: A Federated Learning Second-Order
Framework via Compression and Sketching), the second-order framework FLECS was …
Privacy-Preserving Distributed Learning via Newton Algorithm
Federated learning (FL) is a prominent distributed learning framework. The main barriers to
FL include communication cost and privacy breaches. In this work, we propose a novel …