Fair clustering through fairlets
We study the question of fair clustering under the disparate impact doctrine, where
each protected class must have approximately equal representation in every cluster. We …
Fair algorithms for clustering
S Bera, D Chakrabarty, N Flores… - Advances in Neural …, 2019 - proceedings.neurips.cc
We study the problem of finding low-cost fair clusterings in data where each data point
may belong to many protected groups. Our work significantly generalizes the seminal work …
Consistency of spectral clustering in stochastic block models
We analyze the performance of spectral clustering for community extraction in stochastic
block models. We show that, under mild conditions, spectral clustering applied to the …
Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms
Clustering is a classic topic in optimization with k-means being one of the most fundamental
such problems. In the absence of any restrictions on the input, the best-known algorithm for k …
Fair k-center clustering for data summarization
M Kleindessner, P Awasthi… - … on Machine Learning, 2019 - proceedings.mlr.press
In data summarization we want to choose k prototypes in order to summarize a data set.
We study a setting where the data set comprises several demographic groups and we are …
An Improved Approximation for k-Median and Positive Correlation in Budgeted Optimization
Dependent rounding is a useful technique for optimization problems with hard budget
constraints. This framework naturally leads to negative correlation properties. However, what …
Towards optimal lower bounds for k-median and k-means coresets
V Cohen-Addad, KG Larsen, D Saulpic… - Proceedings of the 54th …, 2022 - dl.acm.org
The (k, z)-clustering problem consists of finding a set of k points called centers, such that the
sum of distances raised to the power of z of every data point to its closest center is …
Fair clustering via equitable group representations
M Abbasi, A Bhaskara… - Proceedings of the 2021 …, 2021 - dl.acm.org
What does it mean for a clustering to be fair? One popular approach seeks to ensure that
each cluster contains groups in (roughly) the same proportion in which they exist in the …
Network cross-validation for determining the number of communities in network data
The stochastic block model (SBM) and its variants have been a popular tool for analyzing
large network data with community structures. In this article, we develop an efficient network …
Dissimilarity-based sparse subset selection
Finding an informative subset of a large collection of data points or models is at the center of
many problems in computer vision, recommender systems, bio/health informatics as well as …