Explainable k-means and k-medians clustering

M Moshkovitz, S Dasgupta… - … on machine learning, 2020 - proceedings.mlr.press

Many clustering algorithms lead to cluster assignments that are hard to explain, partially
because they depend on all the features of the data in a complicated way. To improve …

Cited by 198 · Related articles · All 6 versions

Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering

D Feldman, M Schmidt, C Sohler - SIAM Journal on Computing, 2020 - SIAM

We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …

Cited by 667 · Related articles · All 13 versions

An effective and efficient algorithm for K-means clustering with new formulation

F Nie, Z Li, R Wang, X Li - IEEE Transactions on Knowledge …, 2022 - ieeexplore.ieee.org

K-means is one of the most simple and popular clustering algorithms, which implemented as
a standard clustering method in most of machine learning researches. The goal of K-means …

Cited by 82 · Related articles · All 3 versions

Remaining discharge energy estimation for lithium-ion batteries based on future load prediction considering temperature and ageing effects

X Lai, Y Huang, H Gu, X Han, X Feng, H Dai, Y Zheng… - Energy, 2022 - Elsevier

The estimation of remaining discharge energy (RDE) of lithium-ion batteries is the basis for
the remaining driving range estimation of electric vehicles. The RDE estimation is affected …

Cited by 64 · Related articles · All 5 versions

StreamKM++: A clustering algorithm for data streams

MR Ackermann, M Märtens, C Raupach… - Journal of Experimental …, 2012 - dl.acm.org

We develop a new k-means clustering algorithm for data streams of points from a Euclidean
space. We call this algorithm StreamKM++. Our algorithm computes a small weighted …

Cited by 520 · Related articles · All 12 versions

The effectiveness of Lloyd-type methods for the k-means problem

R Ostrovsky, Y Rabani, LJ Schulman… - Journal of the ACM …, 2013 - dl.acm.org

We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt
to explain its popularity (a half century after its introduction) among practitioners, and in …

Cited by 627 · Related articles · All 28 versions

Fast and provably good seedings for k-means

O Bachem, M Lucic, H Hassani… - Advances in neural …, 2016 - proceedings.neurips.cc

Seeding - the task of finding initial cluster centers - is critical in obtaining high-quality
clusterings for k-Means. However, k-means++ seeding, the state of the art algorithm, does …

Cited by 198 · Related articles · All 11 versions

Statistical and computational guarantees of Lloyd's algorithm and its variants

Y Lu, HH Zhou - arXiv preprint arXiv:1612.02099, 2016 - arxiv.org

Clustering is a fundamental problem in statistics and machine learning. Lloyd's algorithm,
proposed in 1957, is still possibly the most widely used clustering algorithm in practice due …

Cited by 166 · Related articles · All 3 versions

Core-sets: Updated survey

D Feldman - Sampling techniques for supervised or unsupervised …, 2020 - Springer

In optimization or machine learning problems we are given a set of items, usually points in
some metric space, and the goal is to minimize or maximize an objective function over some …

Cited by 110 · Related articles · All 5 versions

Approximate k-means++ in sublinear time

O Bachem, M Lucic, SH Hassani… - Proceedings of the AAAI …, 2016 - ojs.aaai.org

The quality of K-Means clustering is extremely sensitive to proper initialization. The classic
remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive …

Cited by 176 · Related articles · All 11 versions