A comprehensive survey of clustering algorithms

D Xu, Y Tian - Annals of data science, 2015 - Springer
Data analysis is used as a common method in modern science research, which is across
communication science, computer science and biology science. Clustering, as the basic …

A survey on unsupervised outlier detection in high‐dimensional numerical data

A Zimek, E Schubert, HP Kriegel - Statistical Analysis and Data …, 2012 - Wiley Online Library
High‐dimensional data in Euclidean space pose special challenges to data mining
algorithms. These challenges are often indiscriminately subsumed under the term 'curse of …

On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study

GO Campos, A Zimek, J Sander… - Data mining and …, 2016 - Springer
The evaluation of unsupervised outlier detection algorithms is a constant challenge in data
mining research. Little is known regarding the strengths and weaknesses of different …

Ensembles for unsupervised outlier detection: challenges and research questions (a position paper)

A Zimek, RJGB Campello, J Sander - ACM SIGKDD Explorations …, 2014 - dl.acm.org
Ensembles for unsupervised outlier detection is an emerging topic that has been neglected
for a surprisingly long time (although there are reasons why this is more difficult than …

Density-based clustering validation

D Moulavi, PA Jaskowiak, RJGB Campello… - Proceedings of the 2014 …, 2014 - SIAM
One of the most challenging aspects of clustering is validation, which is the objective and
quantitative assessment of clustering results. A number of different relative validity criteria …

The (black) art of runtime evaluation: Are we comparing algorithms or implementations?

HP Kriegel, E Schubert, A Zimek - Knowledge and Information Systems, 2017 - Springer
Any paper proposing a new algorithm should come with an evaluation of efficiency and
scalability (particularly when we are designing methods for “big data”). However, there are …

Validation of cluster analysis results on validation data: A systematic framework

T Ullmann, C Hennig… - … Reviews: Data Mining …, 2022 - Wiley Online Library
Cluster analysis refers to a wide range of data analytic techniques for class discovery and is
popular in many application fields. To assess the quality of a clustering result, different …

A survey on enhanced subspace clustering

K Sim, V Gopalkrishnan, A Zimek, G Cong - Data mining and knowledge …, 2013 - Springer
Subspace clustering finds sets of objects that are homogeneous in subspaces of high-
dimensional datasets, and has been successfully applied in many domains. In recent years …

A collection of benchmark datasets for systematic evaluations of machine learning on the semantic web

P Ristoski, GKD De Vries, H Paulheim - … Kobe, Japan, October 17–21, 2016 …, 2016 - Springer
In the recent years, several approaches for machine learning on the Semantic Web have
been proposed. However, no extensive comparisons between those approaches have been …

On using classification datasets to evaluate graph outlier detection: Peculiar observations and new insights

L Zhao, L Akoglu - Big Data, 2023 - liebertpub.com
It is common practice of the outlier mining community to repurpose classification datasets
toward evaluating various detection models. To that end, often a binary classification dataset …