Big data systems: A software engineering perspective
A Davoudian, M Liu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Big Data Systems (BDSs) are an emerging class of scalable software technologies whereby
massive amounts of heterogeneous data are gathered from multiple sources, managed …
Recent advancements in event processing
M Dayarathna, S Perera - ACM Computing Surveys (CSUR), 2018 - dl.acm.org
Event processing (EP) is a data processing technology that conducts online processing of
event information. In this survey, we summarize the latest cutting-edge work done on EP …
Samza: stateful scalable stream processing at LinkedIn
SA Noghabi, K Paramasivam, Y Pan… - Proceedings of the …, 2017 - dl.acm.org
Distributed stream processing systems need to support stateful processing, recover quickly
from failures to resume such processing, and reprocess an entire data stream quickly. We …
What's really new with NewSQL?
A Pavlo, M Aslett - ACM SIGMOD Record, 2016 - dl.acm.org
A new class of database management systems (DBMSs) called NewSQL tout their ability to
scale modern on-line transaction processing (OLTP) workloads in a way that is not possible …
Realtime data processing at Facebook
GJ Chen, JL Wiener, S Iyer, A Jaiswal, R Lei… - Proceedings of the …, 2016 - dl.acm.org
Realtime data processing powers many use cases at Facebook, including realtime reporting
of the aggregated, anonymized voice of Facebook users, analytics for mobile applications …
Consistency and completeness: Rethinking distributed stream processing in Apache Kafka
G Wang, L Chen, A Dikshit, J Gustafson… - Proceedings of the …, 2021 - dl.acm.org
An increasingly important system requirement for distributed stream processing applications
is to provide strong correctness guarantees under unexpected failures and out-of-order data …
A survey on the evolution of stream processing systems
Stream processing has been an active research field for more than 20 years, but it is now
witnessing its prime time due to recent successful efforts by the research community and …
Data Ingestion for the Connected World.
In this paper, we argue that in many “Big Data” applications, getting data into the system
correctly and at scale via traditional ETL (Extract, Transform, and Load) processes is a …
Beyond analytics: The evolution of stream processing systems
Stream processing has been an active research field for more than 20 years, but it is now
witnessing its prime time due to recent successful efforts by the research community and …
A survey of state management in big data processing systems
The concept of state and its applications vary widely across big data processing systems.
This is evident in both the research literature and existing systems, such as Apache Flink …