Top-k frequent items and item frequency tracking over sliding windows of any size
Many big data applications today require querying highly dynamic and large-scale data
streams to find the top-k most frequent items in the most recent window of a specified size at …
CMSS: Sketching based reliable tracking of large network flows
Reliably tracking large network flows in order to determine so-called elephant flows, also
known as heavy hitters or frequent items, is a common data mining task. Indeed, this kind of …
Mining frequent items in unstructured P2P networks
Large scale decentralized systems, such as P2P, sensor or IoT device networks are
becoming increasingly common, and require robust protocols to address the challenges …
Efficient, semantics-rich transformation and integration of large datasets
JA Bernabé-Díaz, M del Carmen Legaz-García… - Expert Systems with …, 2019 - Elsevier
The digital age is making more datasets available through the Internet, but their
interoperability is still limited. The Semantic Web should play a fundamental role in …
On frequency estimation and detection of heavy hitters in data streams
A stream can be thought of as a very large set of data, sometimes even infinite, which arrives
sequentially and must be processed without the possibility of being stored. In fact, the …
Data stream fusion for accurate quantile tracking and analysis
UDDSketch is a recent algorithm for accurate tracking of quantiles in data streams, derived
from the DDSketch algorithm. UDDSketch provides accuracy guarantees covering the full …
Deterministic, Fast and Accurate Solution of the Heavy Hitters q-Tail Latencies Problem
The heavy hitters-tail latencies problem has been introduced recently. This problem, framed
in the context of data stream monitoring, requires approximating the quantiles of the heavy …
Parallel mining of correlated heavy hitters on distributed and shared-memory architectures
We present parallel algorithms for mining Correlated Heavy Hitters from a two-dimensional
data stream. In particular, we design and implement a message-passing, a shared-memory …
Parallel mining of correlated heavy hitters
We present a message-passing based parallel algorithm for mining Correlated Heavy
Hitters from a two-dimensional data stream. To the best of our knowledge, this is the first …
Distributed mining of time-faded heavy hitters
We present P2PTFHH (Peer-to-Peer Time-Faded Heavy Hitters) which, to the best
of our knowledge, is the first distributed algorithm for mining time-faded heavy hitters on …