A primer on PAC-Bayesian learning

B Guedj - arXiv preprint arXiv:1901.05353, 2019 - arxiv.org
Generalised Bayesian learning algorithms are increasingly popular in machine learning,
due to their PAC generalisation properties and flexibility. The present paper aims at …

Power k-means clustering

J Xu, K Lange - International conference on machine …, 2019 - proceedings.mlr.press
Clustering is a fundamental task in unsupervised machine learning. Lloyd's 1957 algorithm
for k-means clustering remains one of the most widely used due to its speed and simplicity …

Fast rates for general unbounded loss functions: from ERM to generalized Bayes

PD Grünwald, NA Mehta - Journal of Machine Learning Research, 2020 - jmlr.org
We present new excess risk bounds for general unbounded loss functions including log loss
and squared loss, where the distribution of the losses may be heavy-tailed. The bounds hold …

Simpler PAC-Bayesian bounds for hostile data

P Alquier, B Guedj - Machine Learning, 2018 - Springer
PAC-Bayesian learning bounds are of the utmost interest to the learning community. Their
role is to connect the generalization ability of an aggregation distribution ρ to its empirical …

ℓ1-regression with Heavy-tailed Distributions

L Zhang, ZH Zhou - Advances in Neural Information …, 2018 - proceedings.neurips.cc
In this paper, we consider the problem of linear regression with heavy-tailed distributions.
Different from previous studies that use the squared loss to measure the performance, we …

Learning with non-convex truncated losses by SGD

Y Xu, S Zhu, S Yang, C Zhang… - Uncertainty in Artificial …, 2020 - proceedings.mlr.press
Learning with a convex loss function has been a dominating paradigm for many years. It
remains an interesting question how non-convex loss functions help improve the …

Uniform deviation bounds for k-means clustering

O Bachem, M Lucic, SH Hassani… - … on machine learning, 2017 - proceedings.mlr.press
Uniform deviation bounds limit the difference between a model's expected loss and its loss
on an empirical sample uniformly for all models in a learning problem. In this paper, we …

An effective AQI estimation using sensor data and stacking mechanism

DQ Duong, QM Le, TL Nguyen-Tai… - New Trends in …, 2021 - ebooks.iospress.nl
Accurately assessing the air quality index (AQI) values and levels has become an attractive
research topic during the last decades. It is a crucial aspect when studying the possible …

On strong consistency of kernel k-means: A Rademacher complexity approach

A Chakrabarty, S Das - Statistics & Probability Letters, 2022 - Elsevier
We provide uniform concentration bounds on the kernel k-means clustering objective based
on Rademacher complexity by posing the underlying problem as a risk minimization task …

Bagging Improves Generalization Exponentially

H Jie, D Ying, H Lam, W Yin - arXiv preprint arXiv:2405.14741, 2024 - arxiv.org
Bagging is a popular ensemble technique to improve the accuracy of machine learning
models. It hinges on the well-established rationale that, by repeatedly retraining on …