Learning the k in k-means

G Hamerly, C Elkan - Advances in neural information …, 2003 - proceedings.neurips.cc
When clustering a dataset, the right number k of clusters to use is often not obvious, and
choosing k automatically is a hard algorithmic problem. In this paper we present an …

CNAK: Cluster number assisted K-means

J Saha, J Mukherjee - Pattern Recognition, 2021 - Elsevier
The K-means clustering algorithm is well-known for its easy computational approach. In this
algorithm, essential cluster-level information is captured by the K cluster centroids. However …

Understanding K-means non-hierarchical clustering

I Davidson - Computer Science Department of State University of …, 2002 - Citeseer
The K-means algorithm is a popular approach to finding clusters due to its simplicity of
implementation and fast execution. It appears extensively in the machine learning literature …

Adapting the right measures for k-means clustering

J Wu, H Xiong, J Chen - Proceedings of the 15th ACM SIGKDD …, 2009 - dl.acm.org
Clustering validation is a long standing challenge in the clustering literature. While many
validation measures have been developed for evaluating the performance of clustering …

K-means properties on six clustering benchmark datasets

P Fränti, S Sieranoja - Applied Intelligence, 2018 - Springer
This paper has two contributions. First, we introduce a clustering basic benchmark. Second,
we study the performance of k-means using this benchmark. Specifically, we measure how …

Learning-Augmented k-means Clustering

JC Ergun, Z Feng, S Silwal, DP Woodruff… - arXiv preprint arXiv …, 2021 - arxiv.org
k-means clustering is a well-studied problem due to its wide applicability. Unfortunately,
there exist strong theoretical limits on the performance of any algorithm for the k-means …

A method for initialising the K-means clustering algorithm using kd-trees

SJ Redmond, C Heneghan - Pattern Recognition Letters, 2007 - Elsevier
We present a method for initialising the K-means clustering algorithm. Our method hinges on
the use of a kd-tree to perform a density estimation of the data at various locations. We then …

K-means clustering versus validation measures: a data distribution perspective

H Xiong, J Wu, J Chen - Proceedings of the 12th ACM SIGKDD …, 2006 - dl.acm.org
K-means is a widely used partitional clustering method. While there are considerable
research efforts to characterize the key features of K-means clustering, further investigation …

Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads

MMT Chiang, B Mirkin - Journal of Classification, 2010 - Springer
The issue of determining “the right number of clusters” in K-Means has attracted
considerable interest, especially in the recent years. Cluster intermix appears to be a factor …

A deterministic method for initializing k-means clustering

T Su, J Dy - 16th IEEE international conference on tools with …, 2004 - ieeexplore.ieee.org
The performance of K-means clustering depends on the initial guess of partition. We
motivate theoretically and experimentally the use of a deterministic divisive hierarchical …