Data streams: Algorithms and applications

S Muthukrishnan - Foundations and Trends® in Theoretical …, 2005 - nowpublishers.com
In the data stream scenario, input arrives very rapidly and there is limited memory to store
the input. Algorithms have to work with one or few passes over the data, space less than …

Coresets and sketches

JM Phillips - Handbook of discrete and computational geometry, 2017 - taylorfrancis.com
Geometric data summarization has become an essential tool in both geometric
approximation algorithms and where geometry intersects with big data problems. In linear or …

Mergeable summaries

PK Agarwal, G Cormode, Z Huang, JM Phillips… - ACM Transactions on …, 2013 - dl.acm.org
We study the mergeability of data summaries. Informally speaking, mergeability requires
that, given two summaries on two datasets, there is a way to merge the two summaries into a …

Range searching

PK Agarwal - Handbook of discrete and computational geometry, 2017 - taylorfrancis.com
A central problem in computational geometry, range searching arises in many applications,
and a variety of geometric problems can be formulated as range-searching problems. A …

[Book][B] Small summaries for big data

G Cormode, K Yi - 2020 - books.google.com
The massive volume of data generated in modern applications can overwhelm our ability to
conveniently transmit, store, and index it. For many scenarios, building a compact summary …

Coresets in dynamic geometric data streams

G Frahling, C Sohler - Proceedings of the thirty-seventh annual ACM …, 2005 - dl.acm.org
A dynamic geometric data stream consists of a sequence of m insert/delete operations of
points from the discrete space {1,…,Δ}^d [26]. We develop streaming (1+ε)-approximation …

Sampling in dynamic data streams and applications

G Frahling, P Indyk, C Sohler - Proceedings of the twenty-first annual …, 2005 - dl.acm.org
A dynamic geometric data stream is a sequence of m Add/Remove operations of points from
a discrete geometric space {1,…,Δ}^d [21]. Add(p) inserts a point p from {1,…,Δ}^d into the …

Optimal tracking of distributed heavy hitters and quantiles

K Yi, Q Zhang - Proceedings of the twenty-eighth ACM SIGMOD …, 2009 - dl.acm.org
We consider the problem of tracking heavy hitters and quantiles in the distributed
streaming model. The heavy hitters and quantiles are two important statistics for …

Quality and efficiency for kernel density estimates in large data

Y Zheng, J Jestes, JM Phillips, F Li - Proceedings of the 2013 ACM …, 2013 - dl.acm.org
Kernel density estimates are important for a broad variety of applications. Their construction
has been well-studied, but existing techniques are expensive on massive datasets and/or …

The adversarial robustness of sampling

O Ben-Eliezer, E Yogev - Proceedings of the 39th ACM SIGMOD …, 2020 - dl.acm.org
Random sampling is a fundamental primitive in modern algorithms, statistics, and machine
learning, used as a generic method to obtain a small yet "representative" subset of the data …