To cluster, or not to cluster: An analysis of clusterability methods

A Adolfsson, M Ackerman, NC Brownstein - Pattern Recognition, 2019 - Elsevier
Clustering is an essential data mining tool that aims to discover inherent cluster structure in
data. For most applications, applying clustering is only appropriate when cluster structure is …

Hierarchical clustering: Objective functions and algorithms

V Cohen-Addad, V Kanade, F Mallmann-Trenn… - Journal of the ACM …, 2019 - dl.acm.org
Hierarchical clustering is a recursive partitioning of a dataset into clusters at an increasingly
finer granularity. Motivated by the fact that most work on hierarchical clustering was based …

Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms

S Ahmadian, A Norouzi-Fard, O Svensson… - SIAM Journal on …, 2019 - SIAM
Clustering is a classic topic in optimization with k-means being one of the most fundamental
such problems. In the absence of any restrictions on the input, the best-known algorithm for k …

The effectiveness of Lloyd-type methods for the k-means problem

R Ostrovsky, Y Rabani, LJ Schulman… - Journal of the ACM …, 2013 - dl.acm.org
We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt
to explain its popularity (a half century after its introduction) among practitioners, and in …

Approximating k-median via pseudo-approximation

S Li, O Svensson - Proceedings of the forty-fifth annual ACM symposium …, 2013 - dl.acm.org
We present a novel approximation algorithm for k-median that achieves an approximation
guarantee of 1 + √3 + ε, improving upon the decade-old ratio of 3 + ε. Our approach is based …

Local Search Yields Approximation Schemes for k-Means and k-Median in Euclidean and Minor-Free Metrics

V Cohen-Addad, PN Klein, C Mathieu - SIAM Journal on Computing, 2019 - SIAM
We give the first polynomial-time approximation schemes (PTASs) for the following
problems: (1) uniform facility location in edge-weighted planar graphs; (2) k-median and k …

Improved approximations for Euclidean k-means and k-median, via nested quasi-independent sets

V Cohen-Addad, H Esfandiari, V Mirrokni… - Proceedings of the 54th …, 2022 - dl.acm.org
Motivated by data analysis and machine learning applications, we consider the popular high-
dimensional Euclidean k-median and k-means problems. We propose a new primal-dual …

Constant approximation for k-median and k-means with outliers via iterative rounding

R Krishnaswamy, S Li, S Sandeep - Proceedings of the 50th annual ACM …, 2018 - dl.acm.org
In this paper, we present a new iterative rounding framework for many clustering problems.
Using this, we obtain an (α₁ + ε ≤ 7.081 + ε)-approximation algorithm for k-median with …

Local Search Yields a PTAS for k-Means in Doubling Metrics

Z Friggstad, M Rezapour, MR Salavatipour - SIAM Journal on Computing, 2019 - SIAM
The most well-known and ubiquitous clustering problem encountered in nearly every branch
of science is undoubtedly k-means: given a set of data points and a parameter k, select k …

Theoretical Analysis of the k-Means Algorithm – A Survey

J Blömer, C Lammersen, M Schmidt… - … : Selected Results and …, 2016 - Springer
The k-means algorithm is one of the most widely used clustering heuristics. Despite its
simplicity, analyzing its running time and quality of approximation is surprisingly difficult and …