A survey on missing data in machine learning

T Emmanuel, T Maupong, D Mpoeleng, T Semong… - Journal of Big …, 2021 - Springer
Machine learning has been the cornerstone in analysing and extracting information
from data and often a problem of missing values is encountered. Missing values occur …

Missing value imputation: a review and analysis of the literature (2006–2017)

WC Lin, CF Tsai - Artificial Intelligence Review, 2020 - Springer
Missing value imputation (MVI) has been studied for several decades, being the basic
solution method for incomplete dataset problems, specifically those where some data …

Learning k for kNN Classification

S Zhang, X Li, M Zong, X Zhu, D Cheng - ACM Transactions on …, 2017 - dl.acm.org
The K Nearest Neighbor (kNN) method has widely been used in the applications of data
mining and machine learning due to its simple implementation and distinguished …

Missing data imputation using statistical and machine learning methods in a real breast cancer problem

JM Jerez, I Molina, PJ García-Laencina, E Alba… - Artificial intelligence in …, 2010 - Elsevier
OBJECTIVES: Missing data imputation is an important task in cases where it is crucial to use
all available data and not discard records with missing values. This work evaluates the …

A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients

MS Santos, PH Abreu, PJ García-Laencina… - Journal of biomedical …, 2015 - Elsevier
Liver cancer is the sixth most frequently diagnosed cancer and, particularly, Hepatocellular
Carcinoma (HCC) represents more than 90% of primary liver cancers. Clinicians assess …

A distributed spatial–temporal weighted model on MapReduce for short-term traffic flow forecasting

D Xia, B Wang, H Li, Y Li, Z Zhang - Neurocomputing, 2016 - Elsevier
Accurate and timely traffic flow prediction is crucial to proactive traffic management and
control in data-driven intelligent transportation systems (D²ITS), which has attracted great …

Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values

PJ García-Laencina, PH Abreu, MH Abreu… - Computers in biology and …, 2015 - Elsevier
Breast cancer is the most frequently diagnosed cancer in women. Using historical patient
information stored in clinical datasets, data mining and machine learning approaches can …

Missing data imputation by K nearest neighbours based on grey relational structure and mutual information

R Pan, T Yang, J Cao, K Lu, Z Zhang - Applied Intelligence, 2015 - Springer
Treatment of missing data has become increasingly significant in scientific research
and engineering applications. The classic imputation strategy based on the K nearest …

Missing value imputation using a novel grey based fuzzy c-means, mutual information based feature selection, and regression model

AM Sefidian, N Daneshpour - Expert Systems with Applications, 2019 - Elsevier
The presence of missing values in real-world data is not only a prevalent problem but also
an inevitable one. Therefore, missing values should be handled carefully before the mining …

Clinically applicable machine learning approaches to identify attributes of chronic kidney disease (CKD) for use in low-cost diagnostic screening

M Rashed-Al-Mahfuz, A Haque, A Azad… - IEEE Journal of …, 2021 - ieeexplore.ieee.org
Objective: Chronic kidney disease (CKD) is a major public health concern worldwide. High
costs of late-stage diagnosis and insufficient testing facilities can contribute to high morbidity …