ZygOS: Achieving low tail latency for microsecond-scale networked tasks

G Prekas, M Kogias, E Bugnion - Proceedings of the 26th Symposium on …, 2017 - dl.acm.org
This paper focuses on the efficient scheduling on multicore systems of very fine-grain
networked tasks, which are the typical building block of online data-intensive applications …

Periodic hierarchical load balancing for large supercomputers

G Zheng, A Bhatele, E Meneses… - … International Journal of …, 2011 - journals.sagepub.com
Large parallel machines with hundreds of thousands of processors are becoming more
prevalent. Ensuring good load balance is critical for scaling certain classes of parallel …

Join-idle-queue: A novel load balancing algorithm for dynamically scalable web services

Y Lu, Q Xie, G Kliot, A Geller, JR Larus… - Performance …, 2011 - Elsevier
The prevalence of dynamic-content web services, exemplified by search and online social
networking, has motivated an increasingly wide web-facing front end. Horizontal scaling in …

Taskflow: A lightweight parallel and heterogeneous task graph computing system

TW Huang, DL Lin, CX Lin, Y Lin - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Taskflow aims to streamline the building of parallel and heterogeneous applications using a
lightweight task graph-based approach. Taskflow introduces an expressive task graph …

Enterprise: breadth-first graph traversal on GPUs

H Liu, HH Huang - Proceedings of the International Conference for High …, 2015 - dl.acm.org
The Breadth-First Search (BFS) algorithm serves as the foundation for many graph-processing
applications and analytics workloads. While Graphics Processing Unit (GPU) …

Addressing the straggler problem for iterative convergent parallel ML

A Harlap, H Cui, W Dai, J Wei, GR Ganger… - Proceedings of the …, 2016 - dl.acm.org
FlexRR provides a scalable, efficient solution to the straggler problem for iterative machine
learning (ML). The frequent (e.g., per iteration) barriers used in traditional BSP-based …

What's going on? Discovering spatio-temporal dependencies in dynamic scenes

D Kuettel, MD Breitenstein, L Van Gool… - 2010 IEEE computer …, 2010 - ieeexplore.ieee.org
We present two novel methods to automatically learn spatio-temporal dependencies of
moving agents in complex dynamic scenes. They allow the discovery of temporal rules, such as …

Computing with nearby mobile devices: a work sharing algorithm for mobile edge-clouds

N Fernando, SW Loke… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
As mobile devices evolve to be powerful and pervasive computing tools, their usage also
continues to increase rapidly. However, mobile device users frequently experience …

Scheduling parallel programs by work stealing with private deques

UA Acar, A Charguéraud, M Rainey - Proceedings of the 18th ACM …, 2013 - dl.acm.org
Work stealing has proven to be an effective method for scheduling parallel programs on
multicore computers. To achieve high performance, work stealing distributes tasks between …

Optimizing load balancing and data-locality with data-aware scheduling

K Wang, X Zhou, T Li, D Zhao, M Lang… - … Conference on Big …, 2014 - ieeexplore.ieee.org
Load balancing techniques (e.g., work stealing) are important to obtain the best performance
for distributed task scheduling systems that have multiple schedulers making scheduling …