A similarity measure for clustering and its applications

GJ Torres, RB Basnet, AH Sung, S Mukkamala, BM Ribeiro
Int J Electr Comput Syst Eng, 2009 - cs.nmt.edu
Abstract
This paper introduces a measure of similarity between two clusterings of the same dataset, whether they are produced by two different algorithms or by the same algorithm (K-means, for instance, usually produces different clusterings of the same dataset when run with different initializations). We then apply the measure to calculate the similarity between pairs of clusterings, with special interest in comparing various machine clusterings with a human clustering of the same dataset. The measure can thus be used to identify the best clustering algorithm for a specific problem, where "best" means most similar to the human clustering. Experimental results are presented for text categorization of a Portuguese corpus (using a translation-into-English approach) and for the well-known benchmark IRIS dataset. The significance and other potential applications of the proposed measure are discussed.
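
To make the idea of scoring two clusterings of the same dataset concrete, here is a minimal Python sketch. The abstract does not define the paper's own measure, so this sketch substitutes the classical Rand index (the fraction of point pairs on which the two clusterings agree) as a stand-in. The function name `rand_index` and the toy label lists are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index: fraction of point pairs on which two clusterings agree.

    NOTE: a stand-in for illustration only; this is NOT the measure
    proposed in the paper, which the abstract does not define.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one point: nothing to disagree about
    # A pair counts as an agreement when both clusterings put the two
    # points together, or both clusterings keep them apart.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: one "human" clustering and two machine clusterings.
human = [0, 0, 0, 1, 1, 1]
machine_1 = [1, 1, 1, 0, 0, 0]  # same partition, cluster labels renamed
machine_2 = [0, 0, 1, 1, 2, 2]  # a genuinely different partition

print(rand_index(human, machine_1))
print(rand_index(human, machine_2))
```

Because the score depends only on which points are grouped together, renaming cluster labels does not change it. Ranking several machine clusterings by their score against the human clustering mirrors the selection procedure the abstract describes, with whichever similarity measure is actually used.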