Adaptive query processing
A Deshpande, Z Ives, V Raman - Foundations and Trends® …, 2007 - nowpublishers.com
As the data management field has diversified to consider settings in which queries are
increasingly complex, statistics are less available, or data is stored remotely, there has been …
Presto: SQL on everything
R Sethi, M Traverso, D Sundstrom… - 2019 IEEE 35th …, 2019 - ieeexplore.ieee.org
Presto is an open source distributed query engine that supports much of the SQL analytics
workload at Facebook. Presto is designed to be adaptive, flexible, and extensible. It supports …
Dryad: distributed data-parallel programs from sequential building blocks
M Isard, M Budiu, Y Yu, A Birrell, D Fetterly - Proceedings of the 2nd …, 2007 - dl.acm.org
Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel
applications. A Dryad application combines computational "vertices" with communication …
MapReduce online.
MapReduce is a popular framework for data-intensive distributed computing of batch jobs.
To simplify fault tolerance, many implementations of MapReduce materialize the entire …
Streamcloud: An elastic and scalable data streaming system
V Gulisano, R Jimenez-Peris… - … on Parallel and …, 2012 - ieeexplore.ieee.org
Many applications in several domains such as telecommunications, network security, large-
scale sensor networks, require online processing of continuous data flows. They produce …
TimeStream: Reliable stream computation in the cloud
TimeStream is a distributed system designed specifically for low-latency continuous
processing of big streaming data on a large cluster of commodity machines. The unique …
Out-of-order processing: a new architecture for high-performance stream systems
J Li, K Tufte, V Shkapenyuk, V Papadimos… - Proceedings of the …, 2008 - dl.acm.org
Many stream-processing systems enforce an order on data streams during query evaluation
to help unblock blocking operators and purge state from stateful operators. Such in-order …
A survey of state management in big data processing systems
The concept of state and its applications vary widely across big data processing systems.
This is evident in both the research literature and existing systems, such as Apache Flink …
StreamScope: Continuous Reliable Distributed Processing of Big Data Streams
STREAMSCOPE (or STREAMS) is a reliable distributed stream computation engine that has
been deployed in shared 20,000-server production clusters at Microsoft. STREAMS provides …
Scientific workflow design for mere mortals
Recent years have seen a dramatic increase in research and development of scientific
workflow systems. These systems promise to make scientists more productive by automating …