Fine-grained theoretical analysis of federated zeroth-order optimization
The federated zeroth-order optimization (FedZO) algorithm enjoys the advantages of both zeroth-order optimization and federated learning, and has shown exceptional performance on …
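The core primitive behind FedZO-style methods is the two-point zeroth-order gradient estimator, which builds a descent direction from function evaluations alone. Below is a minimal, hedged sketch of that standard estimator; the toy objective, the smoothing radius `mu`, and the single-direction setup are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def two_point_zo_gradient(f, x, mu=1e-3, rng=None):
    """Standard two-point zeroth-order gradient estimate of f at x.

    Samples a random direction u and returns
        (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u,
    an estimate of the gradient of a smoothed version of f that needs
    only function evaluations, no automatic differentiation.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)          # random search direction
    diff = f(x + mu * u) - f(x - mu * u)      # two function evaluations
    return (diff / (2.0 * mu)) * u

# Usage: one gradient-free descent step on a toy quadratic.
f = lambda x: float(np.sum(x ** 2))
x = np.ones(5)
x = x - 0.1 * two_point_zo_gradient(f, x)
```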
DPZero: dimension-independent and differentially private zeroth-order optimization
The widespread practice of fine-tuning pretrained large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs …
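A rough intuition for how zeroth-order updates can make private fine-tuning dimension-independent: privacy noise is added to a clipped scalar finite difference rather than to a d-dimensional gradient. The sketch below illustrates that idea only; `per_example_loss`, the clipping threshold `C`, and the noise scale `sigma` are hypothetical placeholders, and this is not DPZero's exact algorithm or calibration.

```python
import numpy as np

def dp_zo_step(per_example_loss, theta, batch, mu=1e-3, lr=1e-2,
               C=1.0, sigma=1.0, rng=None):
    """One illustrative differentially private zeroth-order step.

    The Gaussian noise is injected into a *scalar* quantity, so its
    magnitude does not grow with the model dimension; only the shared
    random direction z lives in d dimensions.
    """
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(theta.shape)                    # shared direction
    diffs = []
    for example in batch:
        d_i = (per_example_loss(theta + mu * z, example)
               - per_example_loss(theta - mu * z, example)) / (2.0 * mu)
        diffs.append(np.clip(d_i, -C, C))                   # clip each scalar
    noisy = sum(diffs) + sigma * C * rng.standard_normal()  # noise on a scalar
    g_hat = (noisy / len(batch)) * z                        # rescale the direction
    return theta - lr * g_hat
```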
Gradient is all you need?
In this paper we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a …
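For readers unfamiliar with CBO: it evolves a swarm of particles toward a value-weighted consensus point, using only objective evaluations, plus an exploratory noise term. The discretization below is a generic sketch of the anisotropic (component-wise noise) variant; the parameters `alpha`, `lam`, `sigma`, and `dt` are illustrative choices, not those analyzed in the paper.

```python
import numpy as np

def cbo_step(f, X, alpha=10.0, lam=1.0, sigma=0.5, dt=0.05, rng=None):
    """One Euler step of consensus-based optimization (anisotropic noise).

    X is an (N, d) array of particle positions. Each particle drifts
    toward a consensus point weighted by exp(-alpha * f(x)) and is
    perturbed component-wise in proportion to its distance from it.
    """
    rng = rng or np.random.default_rng()
    values = np.array([f(x) for x in X])
    weights = np.exp(-alpha * (values - values.min()))        # stabilized weights
    consensus = (weights[:, None] * X).sum(axis=0) / weights.sum()
    drift = -lam * (X - consensus) * dt
    noise = sigma * (X - consensus) * np.sqrt(dt) * rng.standard_normal(X.shape)
    return X + drift + noise

# Usage: run a few steps on a toy objective.
f = lambda x: float(np.sum((x - 1.0) ** 2))
X = np.random.default_rng(0).standard_normal((50, 3))
for _ in range(100):
    X = cbo_step(f, X)
```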
Black-box tests for algorithmic stability
Algorithmic stability is a concept from learning theory that expresses the degree to which changes to the input data (e.g., removal of a single data point) may affect the outputs of a …
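To make the quantity being tested concrete, the sketch below estimates leave-one-out sensitivity empirically: retrain with a single point removed and measure how far predictions move. This is a generic illustration of algorithmic stability, not the paper's black-box test; the training routine `fit` and the evaluation points are assumed placeholders.

```python
import numpy as np

def empirical_loo_sensitivity(fit, data, eval_points, n_trials=20, rng=None):
    """Crude empirical proxy for leave-one-out algorithmic stability.

    `fit(data)` must return a predictor h with real-valued outputs h(x).
    We compare the full-data predictor against predictors trained with one
    random point removed and report the largest observed prediction change.
    """
    rng = rng or np.random.default_rng()
    h_full = fit(data)
    worst = 0.0
    for _ in range(n_trials):
        i = int(rng.integers(len(data)))
        h_loo = fit(data[:i] + data[i + 1:])       # drop one data point
        gap = max(abs(h_full(x) - h_loo(x)) for x in eval_points)
        worst = max(worst, gap)
    return worst
```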
Toward better PAC-Bayes bounds for uniformly stable algorithms
We give sharper bounds for uniformly stable randomized algorithms in a PAC-Bayesian framework, which improve the existing results by up to a factor of $\sqrt{n}$ (ignoring a log …
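For reference, the stability notion these bounds are built on is uniform stability; the LaTeX below records its standard definition and the generic shape of a high-probability stability-based generalization bound, with constants and logarithmic factors suppressed. The sharpened $\sqrt{n}$ improvement claimed above is the paper's contribution and is not reproduced here.

```latex
% Uniform stability (standard definition): an algorithm A is
% \beta-uniformly stable if replacing one of the n training points
% changes its loss on any example z by at most \beta:
\[
  \sup_{S \simeq S'} \; \sup_{z} \;
  \bigl| \ell(A(S), z) - \ell(A(S'), z) \bigr| \;\le\; \beta ,
\]
% where S and S' differ in a single data point. Stability-based
% generalization bounds then take the generic high-probability form
% (up to constants and logarithmic factors in n and 1/\delta):
\[
  R(A(S)) - \widehat{R}_S(A(S))
  \;\lesssim\; \beta \log n \;+\; \sqrt{\tfrac{\log(1/\delta)}{n}}
  \qquad \text{with probability at least } 1 - \delta .
\]
```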
High-probability generalization bounds for pointwise uniformly stable algorithms
Algorithmic stability is a fundamental concept in statistical learning theory to understand the generalization behavior of optimization algorithms. Existing high-probability bounds are …
Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
Decentralized Stochastic Gradient Descent (D-SGD) is a communication-efficient approach for extracting insights from vast, distributed datasets. Inspired by parallel …
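For context on the algorithm being analyzed, one synchronous D-SGD round interleaves gossip averaging over a mixing matrix with a local minibatch gradient step. The sketch below is a minimal generic iteration; the mixing matrix `W`, the gradient oracle `grad`, and the synchronous update are simplifying assumptions rather than the paper's exact setting.

```python
import numpy as np

def dsgd_round(params, W, grad, batches, lr=0.1):
    """One synchronous round of decentralized minibatch SGD.

    params:  (m, d) array, row i holds node i's current model.
    W:       (m, m) doubly stochastic mixing (gossip) matrix.
    grad:    grad(theta, batch) -> stochastic minibatch gradient.
    batches: list of m minibatches, one per node.
    """
    mixed = W @ params                                  # gossip averaging step
    grads = np.stack([grad(mixed[i], batches[i])        # local gradient step
                      for i in range(len(batches))])
    return mixed - lr * grads
```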
Select without fear: Almost all mini-batch schedules generalize optimally
We establish matching upper and lower generalization error bounds for mini-batch Gradient Descent (GD) training with either deterministic or stochastic, data-independent, but …
Obtaining Lower Query Complexities Through Lightweight Zeroth-Order Proximal Gradient Algorithms
Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance-reduced ZO proximal …
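To illustrate the template behind these query-complexity results, a zeroth-order proximal gradient step replaces the true gradient in a proximal update with a finite-difference estimate. The sketch below pairs a two-point estimator with the soft-thresholding prox of the l1 regularizer; the choice of regularizer and the single random direction are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def zo_proximal_step(f, x, lr=0.1, mu=1e-3, reg=0.01, rng=None):
    """One zeroth-order proximal gradient step for f(x) + reg * ||x||_1."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    g_hat = (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u  # ZO gradient estimate
    return soft_threshold(x - lr * g_hat, lr * reg)           # prox of the l1 term
```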
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Federated Learning (FL) offers a promising framework for collaborative and privacy-preserving machine learning across distributed data sources. However, the substantial …
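One common way zeroth-order methods cut communication, and a plausible reading of the dimension-free claim above, is to exchange only scalar finite differences plus a shared random seed, so client and server regenerate the same perturbation direction locally. The sketch below is a hedged illustration of that seed-sharing idea; the function names and single-direction protocol are assumptions, not the paper's exact scheme.

```python
import numpy as np

def client_update(loss, theta, seed, mu=1e-3):
    """Client: evaluate the loss along a seed-generated direction and
    return only the scalar finite difference."""
    z = np.random.default_rng(seed).standard_normal(theta.shape)
    return (loss(theta + mu * z) - loss(theta - mu * z)) / (2.0 * mu)

def server_aggregate(theta, seed, client_scalars, lr=1e-2):
    """Server: regenerate the same direction from the seed and step along
    it using the averaged scalar -- no d-dimensional vector is transmitted."""
    z = np.random.default_rng(seed).standard_normal(theta.shape)
    return theta - lr * float(np.mean(client_scalars)) * z
```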