µTune: Auto-Tuned Threading for OLDI Microservices

A Sriraman, TF Wenisch - … Symposium on Operating Systems Design and …, 2018 - usenix.org
Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems to
instead comprise numerous, distributed microservices interacting via Remote Procedure …

Accelerometer: Understanding acceleration opportunities for data center overheads at hyperscale

A Sriraman, A Dhanotia - Proceedings of the Twenty-Fifth International …, 2020 - dl.acm.org
At global user population scale, important microservices in warehouse-scale data centers
can grow to account for an enormous installed base of servers. With the end of Dennard …

Few-to-many: Incremental parallelism for reducing tail latency in interactive services

ME Haque, YH Eom, Y He, S Elnikety, R Bianchini… - ACM SIGPLAN …, 2015 - dl.acm.org
Interactive services, such as Web search, recommendations, games, and finance, must
respond quickly to satisfy customers. Achieving this goal requires optimizing tail (e.g., 99th+ …

Amdahl's law in the context of heterogeneous many‐core systems – a survey

MAN Al‐hayanni, F Xia, A Rafiev… - IET Computers & …, 2020 - Wiley Online Library
For over 50 years, Amdahl's Law has been the hallmark model for reasoning about
performance bounds for homogeneous parallel computing resources. As heterogeneous …

A reconfiguration algorithm for power-aware parallel applications

D De Sensi, M Torquati, M Danelutto - ACM Transactions on Architecture …, 2016 - dl.acm.org
In current computing systems, many applications require guarantees on their maximum
power consumption to not exceed the available power budget. On the other hand, for some …

Machine learning-based approaches for energy-efficiency prediction and scheduling in composite cores architectures

H Sayadi, N Patel, A Sasan… - 2017 IEEE international …, 2017 - ieeexplore.ieee.org
Heterogeneous architectures offer diverse computing capabilities. Composite Cores
Architecture (CCA) is a class of dynamic heterogeneous architectures that empowers the …

Adaptive, efficient, parallel execution of parallel programs

S Sridharan, G Gupta, GS Sohi - Proceedings of the 35th ACM SIGPLAN …, 2014 - dl.acm.org
Future multicore processors will be heterogeneous, be increasingly less reliable, and
operate in dynamically changing operating conditions. Such environments will result in a …

Aurora: Seamless optimization of OpenMP applications

AF Lorenzon, CC De Oliveira… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Efficiently exploiting thread-level parallelism has been challenging for software developers.
As many parallel applications do not scale with the number of cores, the task of rightly …

A runtime and non-intrusive approach to optimize EDP by tuning threads and CPU frequency for OpenMP applications

J Schwarzrock, CC de Oliveira, M Ritt… - … on Parallel and …, 2020 - ieeexplore.ieee.org
Efficiently exploiting thread-level parallelism has been challenging. Many parallel
applications are not sufficiently balanced or CPU-bound to take advantage of the increasing …

Work stealing for interactive services to meet target latency

J Li, K Agrawal, S Elnikety, Y He, ITA Lee, C Lu… - Proceedings of the 21st …, 2016 - dl.acm.org
Interactive web services increasingly drive critical business workloads such as search,
advertising, games, shopping, and finance. Whereas optimizing parallel programs and …