Explainable k-means and k-medians clustering
M Moshkovitz, S Dasgupta… - … on machine learning, 2020 - proceedings.mlr.press
Many clustering algorithms lead to cluster assignments that are hard to explain, partially
because they depend on all the features of the data in a complicated way. To improve …
Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering
We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …
An effective and efficient algorithm for K-means clustering with new formulation
K-means is one of the most simple and popular clustering algorithms, which implemented as
a standard clustering method in most of machine learning researches. The goal of K-means …
Remaining discharge energy estimation for lithium-ion batteries based on future load prediction considering temperature and ageing effects
The estimation of remaining discharge energy (RDE) of lithium-ion batteries is the basis for
the remaining driving range estimation of electric vehicles. The RDE estimation is affected …
StreamKM++: A clustering algorithm for data streams
MR Ackermann, M Märtens, C Raupach… - Journal of Experimental …, 2012 - dl.acm.org
We develop a new k-means clustering algorithm for data streams of points from a Euclidean
space. We call this algorithm StreamKM++. Our algorithm computes a small weighted …
The effectiveness of Lloyd-type methods for the k-means problem
R Ostrovsky, Y Rabani, LJ Schulman… - Journal of the ACM …, 2013 - dl.acm.org
We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt
to explain its popularity (a half century after its introduction) among practitioners, and in …
Fast and provably good seedings for k-means
Seeding, the task of finding initial cluster centers, is critical in obtaining high-quality
clusterings for k-Means. However, k-means++ seeding, the state of the art algorithm, does …
Statistical and computational guarantees of Lloyd's algorithm and its variants
Clustering is a fundamental problem in statistics and machine learning. Lloyd's algorithm,
proposed in 1957, is still possibly the most widely used clustering algorithm in practice due …
Core-sets: Updated survey
D Feldman - Sampling techniques for supervised or unsupervised …, 2020 - Springer
In optimization or machine learning problems we are given a set of items, usually points in
some metric space, and the goal is to minimize or maximize an objective function over some …
Approximate k-means++ in sublinear time
The quality of K-Means clustering is extremely sensitive to proper initialization. The classic
remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive …