SPARTA: Runtime task allocation for energy efficient heterogeneous many-cores

B Donyanavard, T Mück, S Sarma, N Dutt - Proceedings of the Eleventh …, 2016 - dl.acm.org
To meet the performance and energy efficiency demands of emerging complex and variable
workloads, heterogeneous many-core architectures are increasingly being deployed …

Efficient and fair multi-programming in GPUs via effective bandwidth management

H Wang, F Luo, M Ibrahim, O Kayiran… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Managing the thread-level parallelism (TLP) of GPGPU applications by limiting it to a certain
degree is known to be effective in improving the overall performance. However, we find that …

The load slice core microarchitecture

TE Carlson, W Heirman, O Allam, S Kaxiras… - Proceedings of the …, 2015 - dl.acm.org
Driven by the motivation to expose instruction-level parallelism (ILP), microprocessor cores
have evolved from simple, in-order pipelines into complex, superscalar out-of-order designs …

Predicting the memory bandwidth and optimal core allocations for multi-threaded applications on large-scale NUMA machines

W Wang, JW Davidson, ML Soffa - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
Modern NUMA platforms offer large numbers of cores to boost performance through
parallelism and multi-threading. However, because performance scalability is limited by …

Maximizing system utilization via parallelism management for co-located parallel applications

Y Cho, CAC Guzman, B Egger - … of the 27th International Conference on …, 2018 - dl.acm.org
With an increasing number of cores and memory controllers in multiprocessor platforms, co-
location of parallel applications is gaining on importance. Key to achieve good performance …

Adapt burstable containers to variable CPU resources

H Huang, Y Zhao, J Rao, S Wu, H Jin… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
In the age of the cloud-native, container technology, referred as OS-level virtualization, is
increasingly adopted to deploy cloud applications. Compared with virtual machines …

Malthusian locks

D Dice - Proceedings of the Twelfth European Conference on …, 2017 - dl.acm.org
Applications running in modern multithreaded environments are sometimes overthreaded.
The excess threads do not improve performance, and in fact may act to degrade …

Auto-tuning Spark big data workloads on POWER8: Prediction-based dynamic SMT threading

Z Jia, C Xue, G Chen, J Zhan, L Zhang, Y Lin… - Proceedings of the …, 2016 - dl.acm.org
Much research work devotes to tuning big data analytics in modern data centers, since the
truth that even a small percentage of performance improvement immediately translates to …

Adaptive resource views for containers

H Huang, J Rao, S Wu, H Jin, K Suo, X Wu - Proceedings of the 28th …, 2019 - dl.acm.org
As OS-level virtualization advances, containers have become a viable alternative to virtual
machines in deploying applications in the cloud. Unlike virtual machines, which allow guest …

Design methodology for responsive and robust MIMO control of heterogeneous multicores

T Mück, B Donyanavard, K Moazzemi… - … on Multi-Scale …, 2018 - ieeexplore.ieee.org
Heterogeneous multicore processors (HMPs) are commonly deployed to meet the
performance and power requirements of emerging workloads. HMPs demand adaptive and …