A survey of tax risk detection using data mining techniques
Tax risk behavior causes serious loss of fiscal revenue, damages the country's public
infrastructure, and disturbs the market economic order of fair competition. In recent years, tax …
Fast noise removal for k-means clustering
This paper considers k-means clustering in the presence of noise. It is known that k-means
clustering is highly sensitive to noise, and thus noise should be removed to obtain a quality …
A weighted k-member clustering algorithm for k-anonymization
As a representative model for privacy preserving data publishing, K-anonymity has raised a
considerable number of questions for researchers over the past few decades. Among them …
Fast algorithms for distributed k-clustering with outliers
In this paper, we study the $k$-clustering problems with outliers in distributed setting. The
current best results for the distributed $k$-center problem with outliers have quadratic local …
Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction
We study the problem of $k$-center clustering with outliers in arbitrary metrics and
Euclidean space. Though a number of methods have been developed in the past decades, it …
Privacy preserving dynamic data release against synonymous linkage based on microaggregation
The rapid development of the mobile Internet coupled with the widespread use of intelligent
terminals have intensified the digitization of personal information and accelerated the …
MapReduce algorithms for robust center-based clustering in doubling metrics
E Dandolo, A Mazzetto, A Pietracaprina… - Journal of Parallel and …, 2024 - Elsevier
Clustering is a pivotal primitive for unsupervised learning and data analysis. A popular
variant is the (k, ℓ)-clustering problem, where, given a pointset P from a metric space, one …
Federated matrix factorization: Algorithm design and application to data clustering
Recent demands on data privacy have called for federated learning (FL) as a new
distributed learning paradigm in massive and heterogeneous networks. Although many FL …
An Improved Approximation Algorithm for the k-Means Problem with Penalties
The clustering problem has been paid lots of attention in various fields of computer science.
However, in many applications, the existence of noisy data poses a big challenge for the …
A practical algorithm for distributed clustering and outlier detection
J Chen, E Sadeqi Azer… - Advances in Neural …, 2018 - proceedings.neurips.cc
We study the classic k-means/median clustering, which are fundamental problems in
unsupervised learning, in the setting where data are partitioned across multiple sites, and …