User-friendly introduction to PAC-Bayes bounds

P Alquier - Foundations and Trends® in Machine Learning, 2024 - nowpublishers.com
Aggregated predictors are obtained by making a set of basic predictors vote according to
some weights, that is, to some probability distribution. Randomized predictors are obtained …
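
To fix ideas, a representative bound of the family this monograph surveys (a standard McAllester-style PAC-Bayes inequality in my notation; constants differ across versions): for a loss bounded in [0, 1], a prior \pi chosen before seeing the sample of size n, and any \delta \in (0, 1), with probability at least 1 - \delta, simultaneously for all posteriors \rho,

\[
\mathbb{E}_{\theta\sim\rho}[R(\theta)] \;\le\; \mathbb{E}_{\theta\sim\rho}[\hat{R}_n(\theta)] \;+\; \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(2\sqrt{n}/\delta)}{2n}},
\]

where R is the population risk, \hat{R}_n the empirical risk, and the KL term charges posteriors for moving far from the prior.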

Recent advances in deep learning theory

F He, D Tao - arXiv preprint arXiv:2012.10931, 2020 - arxiv.org
Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of …

Reasoning about generalization via conditional mutual information

T Steinke, L Zakynthinou - Conference on Learning Theory, 2020 - proceedings.mlr.press
We provide an information-theoretic framework for studying the generalization properties of
machine learning algorithms. Our framework ties together existing approaches, including …
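
The central quantity is conditional mutual information (CMI): draw a "supersample" \tilde{Z} of n pairs of examples and independent selector bits U \in \{0,1\}^n picking one example per pair as training data. For a loss bounded in [0, 1], the framework yields a bound of roughly this shape (stated from memory, so treat the constant as indicative):

\[
\big|\mathbb{E}[R(W) - \hat{R}_n(W)]\big| \;\le\; \sqrt{\frac{2\, I(W; U \mid \tilde{Z})}{n}}.
\]

Unlike the unconditional mutual information I(W; S), which can be infinite, I(W; U \mid \tilde{Z}) is at most n bits, so the bound is always finite.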

Tightening mutual information-based bounds on generalization error

Y Bu, S Zou, VV Veeravalli - IEEE Journal on Selected Areas in …, 2020 - ieeexplore.ieee.org
An information-theoretic upper bound on the generalization error of supervised learning
algorithms is derived. The bound is constructed in terms of the mutual information between …
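
The flavor of the tightening: rather than the mutual information I(W; S) between the learned hypothesis W and the full sample S, the bound is built from per-sample terms I(W; Z_i). For a \sigma-sub-Gaussian loss it reads, in a standard paraphrase,

\[
\big|\mathbb{E}[R(W) - \hat{R}_n(W)]\big| \;\le\; \frac{1}{n}\sum_{i=1}^{n} \sqrt{2\sigma^2\, I(W; Z_i)},
\]

which, for i.i.d. data, is never larger than the Xu–Raginsky bound \sqrt{2\sigma^2 I(W; S)/n}.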

Information-theoretic generalization bounds for stochastic gradient descent

G Neu, GK Dziugaite, M Haghifam… - … on Learning Theory, 2021 - proceedings.mlr.press
We study the generalization properties of the popular stochastic optimization method known
as stochastic gradient descent (SGD) for optimizing general non-convex loss functions. Our …
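
For reference, the algorithm under analysis is plain minibatch SGD. A minimal, self-contained sketch (function names, defaults, and the numpy data layout are illustrative, not from the paper):

    import numpy as np

    def sgd(grad_fn, w0, data, lr=0.01, batch_size=32, steps=1000, seed=0):
        # Vanilla minibatch SGD: w <- w - lr * g, where g is the gradient
        # of the loss on a uniformly sampled minibatch of the training set.
        rng = np.random.default_rng(seed)
        w = np.asarray(w0, dtype=float).copy()
        for _ in range(steps):
            idx = rng.choice(len(data), size=batch_size, replace=False)
            w = w - lr * grad_fn(w, data[idx])
        return w

The information-theoretic bounds then control generalization through how much this noisy recursion lets the final w depend on individual training points.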

Sharpened generalization bounds based on conditional mutual information and an application to noisy, iterative algorithms

M Haghifam, J Negrea, A Khisti… - Advances in …, 2020 - proceedings.neurips.cc
The information-theoretic framework of Russo and Zou (2016) and Xu and Raginsky (2017)
provides bounds on the generalization error of a learning algorithm in terms of the mutual …
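
The starting point being sharpened is the input-output mutual information bound: for a \sigma-sub-Gaussian loss and training sample S of size n (the standard Xu–Raginsky statement, in my notation),

\[
\big|\mathbb{E}[R(W) - \hat{R}_n(W)]\big| \;\le\; \sqrt{\frac{2\sigma^2\, I(W; S)}{n}}.
\]

The sharpened variants replace I(W; S) with conditional-mutual-information quantities that remain finite along the iterates of noisy algorithms such as Langevin dynamics.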

The dynamics of sharpness-aware minimization: Bouncing across ravines and drifting towards wide minima

PL Bartlett, PM Long, O Bousquet - Journal of Machine Learning Research, 2023 - jmlr.org
We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method
for deep networks that has exhibited performance improvements on image and language …
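
For context, one SAM step (the Foret et al. update whose dynamics are analyzed here); a minimal numpy sketch with rho and lr as illustrative hyperparameters:

    import numpy as np

    def sam_step(w, grad_fn, rho=0.05, lr=0.1):
        # (1) Ascend to the approximately worst point in an L2 ball of
        #     radius rho around w, via the normalized gradient direction.
        g = grad_fn(w)
        w_adv = w + rho * g / (np.linalg.norm(g) + 1e-12)
        # (2) Descend from w using the gradient taken at the perturbed point.
        return w - lr * grad_fn(w_adv)

The "bouncing" and "drifting" of the title describe how this two-step update behaves near narrow versus wide minima.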

On the role of data in PAC-Bayes bounds

GK Dziugaite, K Hsu, W Gharbieh… - International …, 2021 - proceedings.mlr.press
The dominant term in PAC-Bayes bounds is often the Kullback-Leibler divergence between
the posterior and prior. For so-called linear PAC-Bayes risk bounds based on the empirical …
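
The "linear" bounds in question have, for a loss in [0, 1] and a fixed \lambda > 0, roughly the following form (my rendering; the exact constants vary across statements): with probability at least 1 - \delta, for all posteriors \rho,

\[
\mathbb{E}_{\theta\sim\rho}[R(\theta)] \;\le\; \mathbb{E}_{\theta\sim\rho}[\hat{R}_n(\theta)] \;+\; \frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(1/\delta)}{\lambda} \;+\; \frac{\lambda}{8n}.
\]

Here the KL term enters linearly rather than under a square root; one natural lever for shrinking it, examined in this line of work, is choosing the prior \pi itself from data.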

Randomized adversarial training via Taylor expansion

G Jin, X Yi, D Wu, R Mu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In recent years, there has been an explosion of research into developing deep neural networks that are more robust to adversarial examples. Adversarial training has emerged as one of the …
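
As background for the Taylor-expansion view: linearizing the loss in the input, L(x + \delta) \approx L(x) + \delta^\top \nabla_x L(x), makes the inner maximization over \|\delta\|_\infty \le \epsilon solvable in closed form, which is the familiar FGSM step. A minimal sketch of that first-order step (standard background, not the paper's randomized scheme):

    import numpy as np

    def fgsm_delta(grad_x, eps):
        # First-order Taylor expansion: L(x + d) ~= L(x) + d . grad_x.
        # Over the box ||d||_inf <= eps the maximizer is d = eps * sign(grad_x).
        return eps * np.sign(grad_x)

    # Usage: x_adv = np.clip(x + fgsm_delta(grad_x, 8 / 255), 0.0, 1.0)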

Shape matters: Understanding the implicit bias of the noise covariance

JZ HaoChen, C Wei, J Lee… - Conference on Learning …, 2021 - proceedings.mlr.press
The noise in stochastic gradient descent (SGD) provides a crucial implicit regularization
effect for training overparameterized models. Prior theoretical work largely focuses on …
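
"Shape" here refers to the covariance of the minibatch gradient noise. Writing SGD as a noisy gradient step (standard notation, not the paper's):

\[
w_{t+1} = w_t - \eta\big(\nabla L(w_t) + \xi_t\big), \qquad
\mathrm{Cov}(\xi_t) \;\approx\; \frac{1}{B}\Big(\mathbb{E}_z\big[g_z g_z^\top\big] - \nabla L(w_t)\,\nabla L(w_t)^\top\Big),
\]

where g_z is a per-example gradient and B is the batch size. The thesis is that the structure of this covariance, not merely its magnitude, determines the implicit regularization.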