K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data
Advances in recent techniques for scientific data collection in the era of big data allow for the
systematic accumulation of large quantities of data at various data-capturing sites. Similarly …
A review of data fusion techniques
F Castanedo - The Scientific World Journal, 2013 - Wiley Online Library
The integration of data and knowledge from several sources is known as data fusion. This
paper summarizes the state of the data fusion field and describes the most relevant studies …
Fast density peak clustering for large scale data based on kNN
The Density Peak (DPeak) clustering algorithm is not applicable to large-scale data,
because two quantities, i.e., ρ and δ, are both obtained by a brute-force algorithm with complexity …
Two improved k-means algorithms
K-means algorithm is the most commonly used simple clustering method. For a large
number of high dimensional numerical data, it provides an efficient method for classifying …
Elastic machine learning algorithms in Amazon SageMaker
There is a large body of research on scalable machine learning (ML). Nevertheless, training
ML models on large, continuously evolving datasets is still a difficult and costly undertaking …
Sage: Self-tuning approximation for graphics engines
Approximate computing, where computation accuracy is traded off for better performance or
higher data throughput, is one solution that can help data processing keep pace with the …
Approximate k-means++ in sublinear time
The quality of K-Means clustering is extremely sensitive to proper initialization. The classic
remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive …
An evolutionary algorithm for clustering data streams with a variable number of clusters
Several algorithms for clustering data streams based on k-Means have been proposed in
the literature. However, most of them assume that the number of clusters, k, is known a priori …
State-of-the-art on clustering data streams
M Ghesmoune, M Lebbah, H Azzag - Big Data Analytics, 2016 - Springer
Clustering is a key data mining task. This is the problem of partitioning a set of observations
into clusters such that the intra-cluster observations are similar and the inter-cluster …