A catalog of stream processing optimizations

M Hirzel, R Soulé, S Schneider, B Gedik… - ACM Computing Surveys …, 2014 - dl.acm.org
Various research communities have independently arrived at stream processing as a
programming model for efficient and parallel computing. These communities include digital …

Exploiting coarse-grained task, data, and pipeline parallelism in stream programs

MI Gordon, W Thies, S Amarasinghe - ACM SIGPLAN Notices, 2006 - dl.acm.org
As multicore architectures enter the mainstream, there is a pressing demand for high-level
programming models that can effectively map to them. Stream programming offers an …

A survey of state management in big data processing systems

QC To, J Soto, V Markl - The VLDB Journal, 2018 - Springer
The concept of state and its applications vary widely across big data processing systems.
This is evident in both the research literature and existing systems, such as Apache Flink …

Language and compiler support for stream programs

WF Thies - 2009 - dspace.mit.edu
Stream programs represent an important class of high-performance computations. Defined
by their regular processing of sequences of data, stream programs appear most commonly …

Paver: Locality graph-based thread block scheduling for GPUs

D Tripathy, A Abdolrashidi, LN Bhuyan, L Zhou… - ACM Transactions on …, 2021 - dl.acm.org
The massive parallelism present in GPUs comes at the cost of reduced L1 and L2 cache
sizes per thread, leading to serious cache contention problems such as thrashing. Hence …

11 PFLOP/s simulations of cloud cavitation collapse

D Rossinelli, B Hejazialhosseini… - Proceedings of the …, 2013 - dl.acm.org
We present unprecedented, high throughput simulations of cloud cavitation collapse on 1.6
million cores of Sequoia reaching 55% of its nominal peak performance, corresponding to …

Tutorial: stream processing optimizations

S Schneider, M Hirzel, B Gedik - … of the 7th ACM international conference …, 2013 - dl.acm.org
This tutorial starts with a survey of optimizations for streaming applications. The survey is
organized as a catalog that introduces uniform terminology and a common categorization of …

Locality-aware task management for unstructured parallelism: A quantitative limit study

RM Yoo, CJ Hughes, C Kim, YK Chen… - Proceedings of the …, 2013 - dl.acm.org
As we increase the number of cores on a processor die, the on-chip cache hierarchies that
support these cores are getting larger, deeper, and more complex. As a result, non-uniform …

Compiler techniques for scalable performance of stream programs on multicore architectures

MI Gordon - 2010 - Citeseer
Given the ubiquity of multicore processors, there is an acute need to enable the
development of scalable parallel applications without unduly burdening programmers …

Dynamic expressivity with static optimization for streaming languages

R Soulé, MI Gordon, S Amarasinghe, R Grimm… - Proceedings of the 7th …, 2013 - dl.acm.org
Developers increasingly use streaming languages to write applications that process large
volumes of data with high throughput. Unfortunately, when picking which streaming …