Fine-grained theoretical analysis of federated zeroth-order optimization

J Chen, H Chen, B Gu, H Deng - Advances in Neural …, 2024 - proceedings.neurips.cc
The federated zeroth-order optimization (FedZO) algorithm enjoys the advantages of both zeroth-
order optimization and federated learning, and has shown exceptional performance on …
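
To make the zeroth-order ingredient concrete, below is a minimal sketch of a two-point gradient estimate inside a client's local update, in the spirit of FedZO. The loss, smoothing radius mu, step size, and number of local steps are illustrative assumptions, not the paper's exact algorithm or analysis.

# Hedged sketch: two-point zeroth-order estimate in a federated local update.
# All constants and function names are illustrative, not FedZO's specification.
import numpy as np

def zo_gradient(loss, x, mu=1e-3, rng=None):
    """Two-point estimator: g ~ (loss(x + mu*u) - loss(x - mu*u)) / (2*mu) * u."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    return (loss(x + mu * u) - loss(x - mu * u)) / (2.0 * mu) * u

def local_update(loss, x_global, num_local_steps=5, lr=0.01, seed=0):
    """One client's local zeroth-order SGD steps starting from the server model."""
    rng = np.random.default_rng(seed)
    x = x_global.copy()
    for _ in range(num_local_steps):
        x -= lr * zo_gradient(loss, x, rng=rng)
    return x

# Toy usage: quadratic loss, two simulated clients averaged by the server.
loss = lambda w: float(np.sum((w - 1.0) ** 2))
w0 = np.zeros(10)
clients = [local_update(loss, w0, seed=s) for s in range(2)]
w1 = np.mean(clients, axis=0)  # server aggregation (FedAvg-style mean)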

DPZero: dimension-independent and differentially private zeroth-order optimization

L Zhang, KK Thekumparampil, S Oh… - International Workshop on …, 2023 - openreview.net
The widespread practice of fine-tuning pretrained large language models (LLMs) on domain-
specific data faces two major challenges in memory and privacy. First, as the size of LLMs …
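
As a rough illustration of why privacy noise can be dimension-independent here: a two-point zeroth-order update is a scalar finite difference times a shared random direction, so only the scalar needs to be clipped and noised. The sketch below follows that idea under illustrative constants; it is not DPZero's exact procedure or its privacy accounting.

# Hedged sketch: privatize the scalar finite difference, not a full gradient.
# Names and constants are assumptions for illustration only.
import numpy as np

def private_zo_step(per_example_losses, x, lr=0.1, mu=1e-3,
                    clip=1.0, noise_mult=1.0, rng=None):
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)          # shared random direction
    diffs = np.array([
        (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu)   # one scalar per example
        for f in per_example_losses
    ])
    diffs = np.clip(diffs, -clip, clip)       # clip each scalar contribution
    noisy = diffs.mean() + rng.normal(0.0, noise_mult * clip / len(diffs))
    return x - lr * noisy * u                 # 1-D noise, regardless of model size

# Toy usage with three per-example quadratic losses.
targets = [0.5, 1.0, 1.5]
losses = [lambda w, t=t: float(np.sum((w - t) ** 2)) for t in targets]
x = np.zeros(100)
x = private_zo_step(losses, x)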

Gradient is all you need?

K Riedl, T Klock, C Geldhauser, M Fornasier - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper we provide a novel analytical perspective on the theoretical understanding of
gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a …
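
For readers unfamiliar with CBO, the following is a minimal sketch of one iteration of its derivative-free particle dynamics: particles are pulled toward a Gibbs-weighted consensus point and perturbed by noise scaled to their distance from it. The parameters (alpha, lam, sigma, dt) are illustrative and do not reflect the paper's analytical setting.

# Hedged sketch of one consensus-based optimization (CBO) step.
import numpy as np

def cbo_step(f, X, alpha=30.0, lam=1.0, sigma=0.5, dt=0.1, rng=None):
    """X has shape (num_particles, dim); f maps a particle to a scalar loss."""
    rng = rng or np.random.default_rng()
    losses = np.array([f(x) for x in X])
    w = np.exp(-alpha * (losses - losses.min()))   # Gibbs-type weights
    v = (w[:, None] * X).sum(axis=0) / w.sum()     # weighted consensus point
    drift = -lam * (X - v) * dt                    # pull toward consensus
    dist = np.linalg.norm(X - v, axis=1, keepdims=True)
    noise = sigma * np.sqrt(dt) * dist * rng.standard_normal(X.shape)
    return X + drift + noise

# Toy usage: minimize a shifted quadratic with 50 particles in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5)) * 3.0
for _ in range(200):
    X = cbo_step(lambda x: float(np.sum((x - 2.0) ** 2)), X, rng=rng)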

Black-box tests for algorithmic stability

B Kim, RF Barber - Information and Inference: A Journal of the …, 2023 - academic.oup.com
Algorithmic stability is a concept from learning theory that expresses the degree to which
changes to the input data (e.g., removal of a single data point) may affect the outputs of a …
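
As a reference point, a standard textbook formulation of uniform stability (which may differ from the precise notion the black-box test targets) is: an algorithm $A$ is $\beta$-uniformly stable if

\[
\sup_{z}\, \bigl|\ell(A(S), z) - \ell(A(S'), z)\bigr| \le \beta
\quad \text{for all datasets } S, S' \text{ differing in a single data point.}
\]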

Toward better PAC-Bayes bounds for uniformly stable algorithms

S Zhou, Y Lei, A Kabán - Advances in Neural Information …, 2023 - proceedings.neurips.cc
We give sharper bounds for uniformly stable randomized algorithms in a PAC-Bayesian
framework, which improve the existing results by up to a factor of $\sqrt{n}$ (ignoring a log …
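
For orientation, one classical McAllester-type PAC-Bayesian bound (not the sharpened bound of this paper) states that, with probability at least $1-\delta$ over an i.i.d. sample of size $n$, every posterior $Q$ satisfies

\[
\mathbb{E}_{h\sim Q}[R(h)] \;\le\; \mathbb{E}_{h\sim Q}[\hat R_n(h)]
+ \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]

where $P$ is a data-independent prior; the paper's contribution is tightening bounds of this kind for uniformly stable randomized algorithms.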

High-probability generalization bounds for pointwise uniformly stable algorithms

J Fan, Y Lei - Applied and Computational Harmonic Analysis, 2024 - Elsevier
Algorithmic stability is a fundamental concept in statistical learning theory to understand the
generalization behavior of optimization algorithms. Existing high-probability bounds are …

Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent

J Wang, H Chen - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Decentralized Stochastic Gradient Descent (D-SGD) is a communication-efficient approach
for learning from large, distributed datasets. Inspired by parallel …
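
The sketch below shows one round of decentralized minibatch SGD as commonly formulated: gossip averaging with a doubly stochastic mixing matrix followed by a local gradient step. The quadratic objectives, mixing matrix, and step size are illustrative assumptions, not the paper's setting.

# Hedged sketch of one D-SGD round on a small ring of nodes.
import numpy as np

def dsgd_round(X, grads, W, lr=0.05):
    """X: (num_nodes, dim) local models; grads[i](x) -> node i's (stochastic) gradient at x."""
    mixed = W @ X                                   # neighbor averaging (gossip)
    return mixed - lr * np.stack([g(x) for g, x in zip(grads, mixed)])

# Toy usage: 3 nodes, each with its own quadratic objective.
targets = [0.0, 1.0, 2.0]
grads = [lambda x, t=t: 2.0 * (x - t) for t in targets]
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])                  # doubly stochastic mixing matrix
X = np.zeros((3, 4))
for _ in range(100):
    X = dsgd_round(X, grads, W)                     # models drift toward the average target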

Select without fear: Almost all mini-batch schedules generalize optimally

KE Nikolakakis, A Karbasi, D Kalogerias - arXiv preprint arXiv:2305.02247, 2023 - arxiv.org
We establish matching upper and lower generalization error bounds for mini-batch Gradient
Descent (GD) training with either deterministic or stochastic, data-independent, but …

Obtaining Lower Query Complexities Through Lightweight Zeroth-Order Proximal Gradient Algorithms

B Gu, X Wei, H Zhang, Y Chang, H Huang - Neural Computation, 2024 - direct.mit.edu
Zeroth-order (ZO) optimization is one key technique for machine learning problems where
gradient calculation is expensive or impossible. Several variance-reduced ZO proximal …
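
To illustrate the ZO proximal gradient template (without the paper's variance reduction or query-complexity machinery), the sketch below combines a two-point gradient estimate with the soft-thresholding proximal operator for an l1 penalty. All constants are illustrative assumptions.

# Hedged sketch of a zeroth-order proximal gradient step for an l1-regularized problem.
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def zo_prox_step(loss, x, lr=0.05, mu=1e-3, reg=0.01, rng=None):
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    g = (loss(x + mu * u) - loss(x - mu * u)) / (2.0 * mu) * u   # ZO gradient estimate
    return soft_threshold(x - lr * g, lr * reg)                  # proximal step

# Toy usage: sparse least squares without explicit gradients.
rng = np.random.default_rng(1)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
loss = lambda w: float(np.sum((A @ w - b) ** 2)) / 30.0
w = np.zeros(10)
for _ in range(500):
    w = zo_prox_step(loss, w, rng=rng)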

Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization

Z Li, B Ying, Z Liu, H Yang - arXiv preprint arXiv:2405.15861, 2024 - arxiv.org
Federated Learning (FL) offers a promising framework for collaborative and privacy-
preserving machine learning across distributed data sources. However, the substantial …
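
One plausible reading of how zeroth-order updates shrink uplink traffic is sketched below: a client's update is a scalar finite difference times a pseudorandom direction, so it can upload just the scalar and the seed, and the server regenerates the direction locally. This is an assumption-laden illustration of the general idea, not necessarily the paper's exact protocol.

# Hedged sketch: communicate a seed plus a scalar instead of a full-dimension vector.
import numpy as np

def client_message(loss, x, seed, mu=1e-3):
    """Client: evaluate the finite difference along a seeded random direction."""
    u = np.random.default_rng(seed).standard_normal(x.shape)
    scalar = (loss(x + mu * u) - loss(x - mu * u)) / (2.0 * mu)
    return seed, float(scalar)                      # O(1) numbers, not O(dim)

def server_apply(x, messages, lr=0.05):
    """Server: rebuild each direction from its seed and average the updates."""
    step = np.zeros_like(x)
    for seed, scalar in messages:
        u = np.random.default_rng(seed).standard_normal(x.shape)
        step += scalar * u
    return x - lr * step / len(messages)

# Toy usage: two clients with slightly different quadratic objectives.
x = np.zeros(1000)
losses = [lambda w, t=t: float(np.sum((w - t) ** 2)) for t in (0.9, 1.1)]
msgs = [client_message(f, x, seed=i) for i, f in enumerate(losses)]
x = server_apply(x, msgs)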