A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

Tactical provenance analysis for endpoint detection and response systems

WU Hassan, A Bates, D Marino - 2020 IEEE Symposium on …, 2020 - ieeexplore.ieee.org
Endpoint Detection and Response (EDR) tools provide visibility into sophisticated intrusions
by matching system events against known adversarial behaviors. However, current solutions …

A survey on provenance: What for? What form? What from?

M Herschel, R Diestelkämper, H Ben Lahmar - The VLDB Journal, 2017 - Springer
Provenance refers to any information describing the production process of an end product,
which can be anything from a piece of digital data to a physical object. While this survey …

Big data semantics

P Ceravolo, A Azzini, M Angelini, T Catarci… - Journal on Data …, 2018 - Springer
Big Data technology has discarded traditional data modeling approaches as no longer
applicable to distributed data processing. It is, however, largely recognized that Big Data …

Big data provenance: Challenges and implications for benchmarking

B Glavic - Workshop on Big Data Benchmarks, 2012 - Springer
Data Provenance is information about the origin and creation process of data. Such
information is useful for debugging data and transformations, auditing, evaluating the quality …

The good, the bad, and the differences: Better network diagnostics with differential provenance

A Chen, Y Wu, A Haeberlen, W Zhou… - Proceedings of the 2016 …, 2016 - dl.acm.org
In this paper, we propose a new approach to diagnosing problems in complex distributed
systems. Our approach is based on the insight that many of the trickiest problems are …

Diagnosing missing events in distributed systems with negative provenance

Y Wu, M Zhao, A Haeberlen, W Zhou… - ACM SIGCOMM Computer …, 2014 - dl.acm.org
When debugging a distributed system, it is sometimes necessary to explain the absence of
an event–for instance, why a certain route is not available, or why a certain packet did not …

PrintQueue: performance diagnosis via queue measurement in the data plane

Y Lei, L Yu, V Liu, M Xu - Proceedings of the ACM SIGCOMM 2022 …, 2022 - dl.acm.org
When diagnosing performance anomalies, it is often useful to reason about why a packet
experienced the queuing that it did. To that end, we observe that queuing is both a result of …

Zeno: Diagnosing performance problems with temporal provenance

Y Wu, A Chen, LTX Phan - 16th USENIX Symposium on Networked …, 2019 - usenix.org
When diagnosing a problem in a distributed system, it is sometimes necessary to explain the
timing of an event—for instance, why a response has been delayed, or why the network …

Object-based data flow testing of web applications

CH Liu, DC Kung, P Hsia - Proceedings First Asia-Pacific …, 2000 - ieeexplore.ieee.org
Recently, the extraordinary growth in the World Wide Web has been sweeping through
business and industry. Many companies have developed or integrated their mission-critical …