Authors
Rania Mkhinini Gahar, Olfa Arfaoui, Minyar Sassi Hidri, Nejib Ben Hadj-Alouane
Publication date
2019/10/7
Journal
IEEE Access
Volume
7
Pages
151006-151022
Publisher
IEEE
Description
The recent explosion in data size, in both the number of records and the number of attributes, has triggered the development of many Big Data analytics techniques as well as parallel data processing methods and algorithms. At the same time, it has encouraged the use of data Dimensionality Reduction (DR) procedures. Indeed, more is not always better. Large amounts of data can sometimes degrade the performance of data analytics applications, and this may be caused by the presence of missing data. Missing values are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. In this work, we propose a new distributed statistical approach for high-dimensionality reduction of heterogeneous data that is based on the MapReduce paradigm, limits the curse of dimensionality, and deals with missing values. To handle the latter, we propose to use the Random Forest imputation method …
Total citations
[Chart: citations per year, 2019 to 2024]