A high-performance algorithm for identifying frequent items in data streams

D Anderson, P Bevan, K Lang, E Liberty… - Proceedings of the …, 2017 - dl.acm.org
Estimating frequencies of items over data streams is a common building block in streaming
data measurement and analysis. Misra and Gries introduced their seminal algorithm for the …

Parallel space saving on multi‐ and many‐core processors

M Cafaro, M Pulimeno, I Epicoco… - … Practice and Experience, 2018 - Wiley Online Library
Given an array of n elements and a value 2 ≤ k ≤ n, a frequent item or k‐majority element is
an element occurring more than n/k times. The k‐majority problem requires finding all of …

On frequency estimation and detection of frequent items in time faded streams

M Cafaro, I Epicoco, M Pulimeno, G Aloisio - IEEE Access, 2017 - ieeexplore.ieee.org
We deal with the problem of detecting frequent items in a stream under the constraint that
items are weighted, and recent items must be weighted more than older ones. This kind of …

Fast and accurate mining of correlated heavy hitters

I Epicoco, M Cafaro, M Pulimeno - Data Mining and Knowledge Discovery, 2018 - Springer
The problem of mining correlated heavy hitters (CHH) from a two-dimensional data stream
has been introduced recently, and a deterministic algorithm based on the use of the Misra …

CMSS: Sketching based reliable tracking of large network flows

M Cafaro, I Epicoco, M Pulimeno - Future Generation Computer Systems, 2019 - Elsevier
Reliably tracking large network flows in order to determine so-called elephant flows, also
known as heavy hitters or frequent items, is a common data mining task. Indeed, this kind of …

Mining frequent items in unstructured P2P networks

M Cafaro, I Epicoco, M Pulimeno - Future Generation Computer Systems, 2019 - Elsevier
Large-scale decentralized systems, such as P2P, sensor or IoT device networks, are
becoming increasingly common, and require robust protocols to address the challenges …

On frequency estimation and detection of heavy hitters in data streams

F Ventruto, M Pulimeno, M Cafaro, I Epicoco - Future Internet, 2020 - mdpi.com
A stream can be thought of as a very large set of data, sometimes even infinite, which arrives
sequentially and must be processed without the possibility of being stored. In fact, the …

Data stream fusion for accurate quantile tracking and analysis

M Cafaro, C Melle, I Epicoco, M Pulimeno - Information Fusion, 2023 - Elsevier
UDDSketch is a recent algorithm for accurate tracking of quantiles in data streams, derived
from the DDSketch algorithm. UDDSketch provides accuracy guarantees covering the full …

Parallel mining of time-faded heavy hitters

M Cafaro, M Pulimeno, I Epicoco - Expert Systems with Applications, 2018 - Elsevier
In this paper we present PFDCMSS (Parallel Forward Decay Count–Min Space Saving)
which, to the best of our knowledge, is the world's first message–passing parallel algorithm for …

Deterministic, Fast and Accurate Solution of the Heavy Hitters q-Tail Latencies Problem

A Fornaio, I Epicoco, M Pulimeno, M Cafaro - IEEE Access, 2022 - ieeexplore.ieee.org
The heavy hitters q-tail latencies problem has been introduced recently. This problem, framed
in the context of data stream monitoring, requires approximating the quantiles of the heavy …