Lower bounds and optimal algorithms for personalized federated learning
In this work, we consider the optimization formulation of personalized federated learning
recently introduced by Hanzely & Richtarik (2020) which was shown to give an alternative …
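For orientation, the formulation this refers to (our transcription of Hanzely & Richtarik's mixing objective; notation ours) trades off local fit against consensus:

\[
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^n f_i(x_i) \;+\; \frac{\lambda}{2n}\sum_{i=1}^n \big\| x_i - \bar{x} \big\|^2, \qquad \bar{x} := \frac{1}{n}\sum_{i=1}^n x_i,
\]

where $f_i$ is client $i$'s loss, $\lambda = 0$ yields purely local models, and $\lambda \to \infty$ recovers the single global model of standard federated learning.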
Stochastic gradient descent for hybrid quantum-classical optimization
Within the context of hybrid quantum-classical optimization, gradient descent based
optimizers typically require the evaluation of expectation values with respect to the outcome …
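As a concrete illustration (a toy single-qubit example of ours, not the paper's setting), SGD in this regime replaces exact expectation values with finite-shot estimates, and the parameter-shift rule turns two such estimates into an unbiased gradient sample:

    import numpy as np

    rng = np.random.default_rng(0)

    def expval(theta, shots):
        # Toy circuit: <Z> after RY(theta[0])|0> equals cos(theta[0]);
        # finite-shot noise is emulated by averaging +/-1 outcomes.
        p = 0.5 * (1.0 + np.cos(theta[0]))
        return np.where(rng.random(shots) < p, 1.0, -1.0).mean()

    def parameter_shift_grad(theta, shots):
        # Exact identity for Pauli-generated gates:
        # dE/dtheta_k = (E(theta_k + pi/2) - E(theta_k - pi/2)) / 2.
        grad = np.zeros_like(theta)
        for k in range(theta.size):
            e = np.zeros_like(theta)
            e[k] = np.pi / 2
            grad[k] = 0.5 * (expval(theta + e, shots) - expval(theta - e, shots))
        return grad

    theta = np.array([2.0])
    for _ in range(200):                       # SGD on the shot-noise estimator
        theta -= 0.1 * parameter_shift_grad(theta, shots=64)
    print(theta)                               # drifts toward pi, minimizing cos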
Better theory for SGD in the nonconvex world
A Khaled, P Richtárik - arXiv preprint arXiv:2002.03329, 2020 - arxiv.org
Large-scale nonconvex optimization problems are ubiquitous in modern machine learning,
and among practitioners interested in solving them, Stochastic Gradient Descent (SGD) …
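The relaxed growth condition at the heart of this line of work (our transcription of the paper's "expected smoothness" assumption, to the best of our reading) bounds the second moment of the stochastic gradient $g(x)$ without assuming uniformly bounded variance:

\[
\mathbb{E}\,\| g(x) \|^2 \;\le\; 2A\big(f(x) - f^{\inf}\big) \;+\; B\,\| \nabla f(x) \|^2 \;+\; C,
\]

where $f^{\inf}$ is a lower bound on $f$; taking $A = B = 0$ recovers the classical bounded-variance assumption.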
Random reshuffling: Simple analysis with vast improvements
K Mishchenko, A Khaled… - Advances in Neural …, 2020 - proceedings.neurips.cc
Random Reshuffling (RR) is an algorithm for minimizing finite-sum functions that utilizes
iterative gradient descent steps in conjunction with data reshuffling. Often contrasted with its …
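A minimal sketch of RR on a toy least-squares problem (example ours): each epoch draws one fresh permutation and then sweeps the data without replacement, in contrast to with-replacement SGD:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 5))
    b = A @ rng.normal(size=5)

    def grad_i(x, i):
        # Gradient of the i-th summand f_i(x) = 0.5 * (a_i^T x - b_i)^2.
        return (A[i] @ x - b[i]) * A[i]

    x = np.zeros(5)
    for epoch in range(50):
        for i in rng.permutation(len(b)):   # reshuffle once per epoch...
            x -= 0.01 * grad_i(x, i)        # ...then visit every sample once
    print(np.linalg.norm(A @ x - b))        # residual shrinks toward 0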
Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of
functions satisfying the gradient dominance property with $1 \le \alpha \le 2$ which holds in a …
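For reference (standard definition, our notation): $f$ is gradient dominated with exponent $\alpha$ if

\[
f(x) - f^* \;\le\; c\,\| \nabla f(x) \|^{\alpha} \qquad \text{for all } x,
\]

where $\alpha = 2$ recovers the Polyak-Lojasiewicz condition and smaller $\alpha$ gives a weaker growth regime; the paper's claims concern the range $1 \le \alpha \le 2$.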
Optimizing the numbers of queries and replies in convex federated learning with differential privacy
Federated learning (FL) empowers distributed clients to collaboratively train a shared
machine learning model through exchanging parameter information. Despite the fact that FL …
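To make the setting concrete, here is a generic clipped-and-noised aggregation round (a standard DP-FL template of ours, not the query/reply schedule the paper optimizes):

    import numpy as np

    rng = np.random.default_rng(0)

    def dp_round(w, client_updates, clip, noise_std, lr=0.1):
        # Clip each client's update to bound its sensitivity, average,
        # then add Gaussian noise scaled to the clipping threshold.
        clipped = [u * min(1.0, clip / max(np.linalg.norm(u), 1e-12))
                   for u in client_updates]
        avg = np.mean(clipped, axis=0)
        avg += rng.normal(scale=noise_std * clip, size=avg.shape) / len(clipped)
        return w - lr * avg

    # One round with three clients on a 4-dimensional model.
    w = np.zeros(4)
    updates = [rng.normal(size=4) for _ in range(3)]
    w = dp_round(w, updates, clip=1.0, noise_std=0.5)
    print(w)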
Random coordinate descent: a simple alternative for optimizing parameterized quantum circuits
Variational quantum algorithms rely on the optimization of parameterized quantum circuits in
noisy settings. The commonly used back-propagation procedure in classical machine …
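A sketch of the idea (toy objective ours, standing in for a noisy circuit expectation): each iteration updates a single randomly chosen parameter using the two-point parameter-shift estimate, so the per-step cost stays at two circuit evaluations regardless of how many parameters the circuit has:

    import numpy as np

    rng = np.random.default_rng(1)

    def expval(theta, shots=64):
        # Stand-in for a noisy circuit expectation; exact value is sum(cos(theta)).
        return np.cos(theta).sum() + rng.normal(scale=1.0 / np.sqrt(shots))

    theta = rng.uniform(0.5, np.pi - 0.5, size=8)
    for _ in range(2000):
        k = rng.integers(theta.size)            # one random coordinate per step
        e = np.zeros_like(theta)
        e[k] = np.pi / 2
        g_k = 0.5 * (expval(theta + e) - expval(theta - e))  # parameter-shift
        theta[k] -= 0.2 * g_k
    print(round(expval(theta, shots=10**6), 2))  # near -8, the global minimum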
Asynchronous federated learning with reduced number of rounds and with differential privacy from less aggregated Gaussian noise
M van Dijk, NV Nguyen, TN Nguyen… - arXiv preprint arXiv …, 2020 - arxiv.org
The feasibility of federated learning is highly constrained by the server-clients infrastructure
in terms of network communication. Most newly launched smartphones and IoT devices are …
Lower error bounds for the stochastic gradient descent optimization algorithm: Sharp convergence rates for slowly and fast decaying learning rates
A Jentzen, P Von Wurstemberger - Journal of Complexity, 2020 - Elsevier
The stochastic gradient descent (SGD) optimization algorithm is one of the central tools used
to approximate solutions of stochastic optimization problems arising in machine learning …
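In the notation typical of this line of work (ours; a plausible reading of the title's "slowly and fast decaying learning rates"), the object of study is the recursion

\[
X_{n+1} \;=\; X_n \;-\; \gamma_n\, G_n(X_n), \qquad \gamma_n = \frac{\eta}{n^{\rho}},
\]

where $G_n$ is an unbiased stochastic gradient; lower bounds then constrain how fast $\mathbb{E}\,\| X_n - x^* \|$ can decay as a function of the exponent $\rho$.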
Momentum aggregation for private non-convex ERM
H Tran, A Cutkosky - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We introduce new algorithms and convergence guarantees for privacy-preserving non-convex Empirical Risk Minimization (ERM) on smooth $d$-dimensional objectives. We …
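As a point of reference (a generic DP-SGD-with-momentum template of ours, not the paper's aggregation scheme), the standard private step clips per-example gradients, adds Gaussian noise, and folds the result into a momentum buffer that averages the injected noise across iterations:

    import numpy as np

    rng = np.random.default_rng(0)

    def dp_momentum_step(w, m, per_example_grads, clip, noise_std,
                         lr=0.1, beta=0.9):
        # Clip each example's gradient to bound sensitivity, sum, then noise.
        total = np.zeros_like(w)
        for g in per_example_grads:
            total += g * min(1.0, clip / max(np.linalg.norm(g), 1e-12))
        noisy = (total + rng.normal(scale=noise_std * clip, size=w.shape)) \
                / len(per_example_grads)
        m = beta * m + (1.0 - beta) * noisy   # momentum smooths injected noise
        return w - lr * m, m

    w, m = np.zeros(3), np.zeros(3)
    grads = [rng.normal(size=3) for _ in range(8)]
    w, m = dp_momentum_step(w, m, grads, clip=1.0, noise_std=0.5)
    print(w, m)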