Fair clustering through fairlets

F Chierichetti, R Kumar, S Lattanzi… - Advances in neural …, 2017 - proceedings.neurips.cc
We study the question of fair clustering under the disparate impact doctrine, where
each protected class must have approximately equal representation in every cluster. We …

Fair algorithms for clustering

S Bera, D Chakrabarty, N Flores… - Advances in Neural …, 2019 - proceedings.neurips.cc
We study the problem of finding low-cost fair clusterings in data where each data point
may belong to many protected groups. Our work significantly generalizes the seminal work …

Consistency of spectral clustering in stochastic block models

J Lei, A Rinaldo - The Annals of Statistics, 2015 - JSTOR
We analyze the performance of spectral clustering for community extraction in stochastic
block models. We show that, under mild conditions, spectral clustering applied to the …

Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms

S Ahmadian, A Norouzi-Fard, O Svensson… - SIAM Journal on …, 2019 - SIAM
Clustering is a classic topic in optimization with k-means being one of the most fundamental
such problems. In the absence of any restrictions on the input, the best-known algorithm for k …

Fair k-center clustering for data summarization

M Kleindessner, P Awasthi… - … on Machine Learning, 2019 - proceedings.mlr.press
In data summarization we want to choose k prototypes in order to summarize a data set.
We study a setting where the data set comprises several demographic groups and we are …

An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization

J Byrka, T Pensyl, B Rybicki, A Srinivasan… - ACM Transactions on …, 2017 - dl.acm.org
Dependent rounding is a useful technique for optimization problems with hard budget
constraints. This framework naturally leads to negative correlation properties. However, what …

Towards optimal lower bounds for k-median and k-means coresets

V Cohen-Addad, KG Larsen, D Saulpic… - Proceedings of the 54th …, 2022 - dl.acm.org
The (k, z)-clustering problem consists of finding a set of k points called centers, such that the
sum of distances raised to the power of z of every data point to its closest center is …

Fair clustering via equitable group representations

M Abbasi, A Bhaskara… - Proceedings of the 2021 …, 2021 - dl.acm.org
What does it mean for a clustering to be fair? One popular approach seeks to ensure that
each cluster contains groups in (roughly) the same proportion in which they exist in the …

Network cross-validation for determining the number of communities in network data

K Chen, J Lei - Journal of the American Statistical Association, 2018 - Taylor & Francis
The stochastic block model (SBM) and its variants have been a popular tool for analyzing
large network data with community structures. In this article, we develop an efficient network …

Dissimilarity-based sparse subset selection

E Elhamifar, G Sapiro, SS Sastry - IEEE transactions on pattern …, 2015 - ieeexplore.ieee.org
Finding an informative subset of a large collection of data points or models is at the center of
many problems in computer vision, recommender systems, bio/health informatics as well as …