Clustering approaches for data with missing values: Comparison and evaluation

E Eirola, G Doquire, M Verleysen, A Lendasse - Information Sciences, 2013 - Elsevier

The possibility of missing or incomplete data is often ignored when describing statistical or
machine learning methods, but as it is a common problem in practice, it is relevant to …

Cited by 68 · Related articles · All 10 versions

FINNIM: Iterative imputation of missing values in dissolved gas analysis dataset

Z Sahri, R Yusof, J Watada - IEEE Transactions on Industrial …, 2014 - ieeexplore.ieee.org

Missing values are a common occurrence in a number of real world databases, and
statistical methods have been developed to deal with this problem, referred to as missing …

Cited by 57 · Related articles · All 5 versions

Coresets for clustering with missing values

V Braverman, S Jiang… - Advances in Neural …, 2021 - proceedings.neurips.cc

We provide the first coreset for clustering points in $\mathbb{R}^d$ that have multiple
missing values (coordinates). Previous coreset constructions only allow one missing …

Cited by 13 · Related articles · All 10 versions

Clustering with missing features: a penalized dissimilarity measure based approach

S Datta, S Bhattacharjee, S Das - Machine Learning, 2018 - Springer

Many real-world clustering problems are plagued by incomplete data characterized by
missing or absent features for some or all of the data instances. Traditional clustering …

Cited by 30 · Related articles · All 8 versions

CKNNI: an improved knn-based missing value handling technique

C Jiang, Z Yang - … Intelligent Computing Theories and Applications: 11th …, 2015 - Springer

In data mining field, experimental data sets are often incomplete due to the imperfect nature
of real world situations. However, the incompleteness of data sets generally leads to biased …

Cited by 30 · Related articles

What are clusters in high dimensions and are they difficult to find?

F Klawonn, F Höppner, B Jayaram - … , CHDD 2012, Naples, Italy, May 15 …, 2015 - Springer

The distribution of distances between points in a high-dimensional data set tends to look
quite different from the distribution of the distances in a low-dimensional data set …

Cited by 31 · Related articles · All 7 versions

Know your monkey: identifying primate conservation challenges in an indigenous Kichwa community using an ethnoprimatological approach

CA Stafford, J Alarcon-Valenzuela, J Patiño… - Folia Primatologica, 2016 - brill.com

Increasing pressure on tropical forests is continually highlighting the need to find new
solutions that mitigate the impact of human populations on biodiversity. However …

Cited by 27 · Related articles · All 10 versions

An efficient k‐means‐type algorithm for clustering datasets with incomplete records

A Lithio, R Maitra - Statistical Analysis and Data Mining: The …, 2018 - Wiley Online Library

The k‐means algorithm is arguably the most popular nonparametric clustering method but
cannot generally be applied to datasets with incomplete records. The usual practice then is …

Cited by 14 · Related articles · All 9 versions

Imputation method of missing values for dissolved gas analysis data based on iterative KNN and XGBoost

L Qiao, R Ran, H Wu, Q Zhou, S Liu, Y Liu - Proceedings of the 2018 …, 2018 - dl.acm.org

Power transformers are an important part of the power system. Accurate monitoring of its
operating status is particularly important for the normal and stable operation of the entire …

Cited by 10 · Related articles

Making kernel density estimation robust towards missing values in highly incomplete multivariate data without imputation

R Leibrandt, S Günnemann - Proceedings of the 2018 SIAM International …, 2018 - SIAM

Density estimation is one of the most frequently used data analytics techniques. A major
challenge of real-world datasets is missing values, originating e.g. from sampling errors or …

Cited by 11 · Related articles · All 2 versions