Top-k frequent items and item frequency tracking over sliding windows of any size

C Song, X Liu, T Ge, Y Ge - Information Sciences, 2019 - Elsevier
Many big data applications today require querying highly dynamic and large-scale data
streams to find the top-k most frequent items in the most recent window of a specified size at …

CMSS: Sketching based reliable tracking of large network flows

M Cafaro, I Epicoco, M Pulimeno - Future Generation Computer Systems, 2019 - Elsevier
Reliably tracking large network flows in order to determine so-called elephant flows, also
known as heavy hitters or frequent items, is a common data mining task. Indeed, this kind of …

Mining frequent items in unstructured P2P networks

M Cafaro, I Epicoco, M Pulimeno - Future Generation Computer Systems, 2019 - Elsevier
Large scale decentralized systems, such as P2P, sensor or IoT device networks are
becoming increasingly common, and require robust protocols to address the challenges …

Efficient, semantics-rich transformation and integration of large datasets

JA Bernabé-Díaz, M del Carmen Legaz-García… - Expert Systems with …, 2019 - Elsevier
The digital age is making more datasets available through the Internet, but their
interoperability is still limited. The Semantic Web should play a fundamental role in …

On frequency estimation and detection of heavy hitters in data streams

F Ventruto, M Pulimeno, M Cafaro, I Epicoco - Future Internet, 2020 - mdpi.com
A stream can be thought of as a very large set of data, sometimes even infinite, which arrives
sequentially and must be processed without the possibility of being stored. In fact, the …

Data stream fusion for accurate quantile tracking and analysis

M Cafaro, C Melle, I Epicoco, M Pulimeno - Information Fusion, 2023 - Elsevier
UDDSketch is a recent algorithm for accurate tracking of quantiles in data streams, derived
from the DDSketch algorithm. UDDSketch provides accuracy guarantees covering the full …

Deterministic, Fast and Accurate Solution of the Heavy Hitters q-Tail Latencies Problem

A Fornaio, I Epicoco, M Pulimeno, M Cafaro - IEEE Access, 2022 - ieeexplore.ieee.org
The heavy hitters q-tail latencies problem has been introduced recently. This problem, framed
in the context of data stream monitoring, requires approximating the quantiles of the heavy …

Parallel mining of correlated heavy hitters on distributed and shared-memory architectures

M Pulimeno, I Epicoco, M Cafaro… - … conference on big …, 2018 - ieeexplore.ieee.org
We present parallel algorithms for mining Correlated Heavy Hitters from a two-dimensional
data stream. In particular, we design and implement a message-passing, a shared-memory …

Parallel mining of correlated heavy hitters

M Pulimeno, I Epicoco, M Cafaro, C Melle… - … Science and Its …, 2018 - Springer
We present a message-passing based parallel algorithm for mining Correlated Heavy
Hitters from a two-dimensional data stream. To the best of our knowledge, this is the first …

Distributed mining of time-faded heavy hitters

M Pulimeno, I Epicoco, M Cafaro - Information Sciences, 2021 - Elsevier
We present P2PTFHH (Peer-to-Peer Time-Faded Heavy Hitters) which, to the best
of our knowledge, is the first distributed algorithm for mining time-faded heavy hitters on …