Big data clustering techniques based on spark: a literature review

MM Saeed, Z Al Aghbari, M Alsharidah - PeerJ Computer Science, 2020 - peerj.com
A popular unsupervised learning method, known as clustering, is extensively used in data
mining, machine learning and pattern recognition. The procedure involves grouping of …

Machine learning algorithms in Bigdata analysis and its applications: A Review

F Khoshaba, S Kareem, H Awla… - … Congress on Human …, 2022 - ieeexplore.ieee.org
A wide range of disparate variety of heterogeneous and even disparate data sources has
been integrated into the computer science research principles through the assistance of …

Moth-flame optimization-bat optimization: Map-reduce framework for big data clustering using the Moth-flame bat optimization and sparse Fuzzy C-means

V Ravuri, S Vasundra - Big Data, 2020 - liebertpub.com
The technical advancements in big data have become popular and most desirable among
users for storing, processing, and handling huge data sets. However, clustering using these …

Big Data: controlling fraud by using machine learning libraries on Spark

F Karataş, SA Korkmaz - International Journal of Applied …, 2018 - dergipark.org.tr
Continuous changes and the high calculation volume in network data distribution have
made it more difficult to detect abnormal behaviors within and analyze data. For this cause …

A survey of parallel clustering algorithms based on spark

W Xiao, J Hu - Scientific Programming, 2020 - Wiley Online Library
Clustering is one of the most important unsupervised machine learning tasks, which is
widely used in information retrieval, social network analysis, image processing, and other …

Churn prediction using optimized deep learning classifier on huge telecom data

B Garimella, G Prasad, MHMK Prasad - Journal of Ambient Intelligence …, 2023 - Springer
With the increasing number of telecom providers and services, churn prediction gains
tremendous interest in the current decade. The prediction models based on machine …

PERMS: An efficient rescue route planning system in disasters

X Xu, L Zhang, M Trovati, F Palmieri… - Applied Soft …, 2021 - Elsevier
The occurrence of natural and man-made disasters usually leads to significant social and
economic disruption, as well as high numbers of casualties. Such occurrences are difficult to …

Data mining techniques for IoT and big data—A survey

A Shobanadevi, G Maragatham - … International Conference on …, 2017 - ieeexplore.ieee.org
Data Mining is the discovery of “models” of data. Data dredging is a process of derogatory
referring to attempts for extracting information that was not supported by the data. Today …

RETRACTED ARTICLE: Innovative study on clustering center and distance measurement of K-means algorithm: mapreduce efficient parallel algorithm based on user …

Y Liu, X Du, S Ma - Electronic Commerce Research, 2023 - Springer
The traditional K-means algorithm is very sensitive to the selection of clustering centers and
the calculation of distances, so the algorithm easily converges to a locally optimal solution …

A survey of research on K-means initial center point optimization in the Spark environment.

行艳妮, 钱育蓉, 南方哲… - Application Research of …, 2020 - search.ebscohost.com
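To keep up with the latest research progress on the classic clustering algorithm K-means in
the Spark environment and to grasp the current research hotspots and directions of the K-means
algorithm, this paper surveys research on optimizing the initial center points of K-means …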