Byzantine machine learning: A primer
The problem of Byzantine resilience in distributed machine learning, aka Byzantine machine
learning, consists of designing distributed algorithms that can train an accurate model …
A survey on quantum channel capacities
Quantum information processing exploits the quantum nature of information. It offers
fundamentally new solutions in the field of computer science and extends the possibilities to …
Data stream clustering: A survey
Data stream mining is an active research area that has recently emerged to discover
knowledge from large amounts of continuously generated data. In this context, several data …
Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering
We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …
Random forests for big data
Big Data is one of the major challenges of statistical science and has numerous
consequences from algorithmic and theoretical viewpoints. Big Data always involve massive …
Core vector machines: Fast SVM training on very large data sets
Standard SVM training has O(m^3) time and O(m^2) space complexities, where m is the
training set size. It is thus computationally infeasible on very large data sets. By observing …
Submodularity in machine learning and artificial intelligence
J Bilmes - arXiv preprint arXiv:2202.00132, 2022 - arxiv.org
In this manuscript, we offer a gentle review of submodularity and supermodularity and their
properties. We offer a plethora of submodular definitions; a full description of a number of …
On coresets for k-means and k-median clustering
S Har-Peled, S Mazumdar - Proceedings of the thirty-sixth annual ACM …, 2004 - dl.acm.org
In this paper, we show the existence of small coresets for the problems of computing k-
median and k-means clustering for points in low dimension. In other words, we show that …
Fast approximate spectral clustering
Spectral clustering refers to a flexible class of clustering procedures that can produce high-
quality clusterings on small data sets but which has limited applicability to large-scale …
StreamKM++: A clustering algorithm for data streams
MR Ackermann, M Märtens, C Raupach… - Journal of Experimental …, 2012 - dl.acm.org
We develop a new k-means clustering algorithm for data streams of points from a Euclidean
space. We call this algorithm StreamKM++. Our algorithm computes a small weighted …