Approximate clustering via core-sets

R Guerraoui, N Gupta, R Pinot - ACM Computing Surveys, 2024 - dl.acm.org

The problem of Byzantine resilience in distributed machine learning, aka Byzantine machine
learning, consists of designing distributed algorithms that can train an accurate model …

Cited by 25 Related articles All 3 versions

A survey on quantum channel capacities

L Gyongyosi, S Imre, HV Nguyen - … Communications Surveys & …, 2018 - ieeexplore.ieee.org

Quantum information processing exploits the quantum nature of information. It offers
fundamentally new solutions in the field of computer science and extends the possibilities to …

Cited by 317 Related articles All 8 versions

Data stream clustering: A survey

JA Silva, ER Faria, RC Barros, ER Hruschka… - ACM Computing …, 2013 - dl.acm.org

Data stream mining is an active research area that has recently emerged to discover
knowledge from large amounts of continuously generated data. In this context, several data …

Cited by 709 Related articles All 14 versions

Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering

D Feldman, M Schmidt, C Sohler - SIAM Journal on Computing, 2020 - SIAM

We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …

Cited by 658 Related articles All 13 versions

Random forests for big data

R Genuer, JM Poggi, C Tuleau-Malot… - Big Data Research, 2017 - Elsevier

Big Data is one of the major challenges of statistical science and has numerous
consequences from algorithmic and theoretical viewpoints. Big Data always involve massive …

Cited by 378 Related articles All 35 versions

[PDF] Core vector machines: Fast SVM training on very large data sets

IW Tsang, JT Kwok, PM Cheung, N Cristianini - Journal of Machine …, 2005 - jmlr.org

Standard SVM training has O(m^3) time and O(m^2) space complexities, where m is the
training set size. It is thus computationally infeasible on very large data sets. By observing …

Cited by 1428 Related articles All 18 versions

Submodularity in machine learning and artificial intelligence

J Bilmes - arXiv preprint arXiv:2202.00132, 2022 - arxiv.org

In this manuscript, we offer a gentle review of submodularity and supermodularity and their
properties. We offer a plethora of submodular definitions; a full description of a number of …

Cited by 66 Related articles All 2 versions

On coresets for k-means and k-median clustering

S Har-Peled, S Mazumdar - Proceedings of the thirty-sixth annual ACM …, 2004 - dl.acm.org

In this paper, we show the existence of small coresets for the problems of computing k-
median and k-means clustering for points in low dimension. In other words, we show that …

Cited by 829 Related articles All 20 versions

Fast approximate spectral clustering

D Yan, L Huang, MI Jordan - Proceedings of the 15th ACM SIGKDD …, 2009 - dl.acm.org

Spectral clustering refers to a flexible class of clustering procedures that can produce high-
quality clusterings on small data sets but which has limited applicability to large-scale …

Cited by 669 Related articles All 20 versions

StreamKM++: A clustering algorithm for data streams

MR Ackermann, M Märtens, C Raupach… - Journal of Experimental …, 2012 - dl.acm.org

We develop a new k-means clustering algorithm for data streams of points from a Euclidean
space. We call this algorithm StreamKM++. Our algorithm computes a small weighted …

Cited by 516 Related articles All 12 versions