A survey on compiler autotuning using machine learning

AH Ashouri, W Killian, J Cavazos, G Palermo… - ACM Computing …, 2018 - dl.acm.org
Since the mid-1990s, researchers have been trying to use machine-learning-based
approaches to solve a number of different compiler optimization problems. These …

A survey of performance tuning techniques and tools for parallel applications

D Mustafa - IEEE Access, 2022 - ieeexplore.ieee.org
Automatic parallelization of sequential programs combined with auto-tuning is an alternative
to manual parallelization. With wider research directions and the increased number of …

Efficient auto-tuning of parallel programs with interdependent tuning parameters via auto-tuning framework (ATF)

A Rasch, R Schulze, M Steuwer… - ACM Transactions on …, 2021 - dl.acm.org
Auto-tuning is a popular approach to program optimization: it automatically finds good
configurations of a program's so-called tuning parameters whose values are crucial for …

OmniRPC: a Grid RPC system for parallel programming in cluster and Grid environment

M Sato, T Boku, D Takahashi - CCGrid 2003. 3rd IEEE/ACM …, 2003 - ieeexplore.ieee.org
We have designed and implemented a Grid RPC system called OmniRPC, for parallel
programming in cluster and grid environments. While OmniRPC inherits its API from Ninf, the …

Towards fine-grained dynamic tuning of HPC applications on modern multi-core architectures

M Sourouri, EB Raknes, N Reissmann… - Proceedings of the …, 2017 - dl.acm.org
There is a consensus that exascale systems should operate within a power envelope of
20MW. Consequently, energy conservation is still considered as the most crucial constraint if …

Static placement of computation on heterogeneous devices

G Poesia, B Guimarães, F Ferracioli… - Proceedings of the ACM …, 2017 - dl.acm.org
Heterogeneous architectures characterize today's hardware, ranging from super-computers to
smartphones. However, in spite of this importance, programming such systems is still …

FAST: A fast stencil autotuning framework based on an optimal-solution space model

Y Luo, G Tan, Z Mo, N Sun - Proceedings of the 29th ACM on …, 2015 - dl.acm.org
Stencil computations comprise an important class of kernels in many scientific computing
applications. As the diversity of both architectures and programming models grows …

Improving performance using computational compression through memoization: A case study using a railway power consumption simulator

A Calderón, A García… - … Journal of High …, 2016 - journals.sagepub.com
The objective of data compression is to avoid redundancy in order to reduce the size of the
data to be stored or transmitted. In some scenarios, data compression may help to increase …

Extracting facts from performance tuning history of scientific applications for predicting effective optimization patterns

M Hashimoto, M Terai, T Maeda… - 2015 IEEE/ACM 12th …, 2015 - ieeexplore.ieee.org
To improve performance of large-scale scientific applications, scientists or tuning experts
make various empirical attempts to change compiler options, program parameters or even …

An empirical study of computation-intensive loops for identifying and classifying loop kernels: Full research paper

M Hashimoto, M Terai, T Maeda, K Minami - … of the 8th ACM/SPEC on …, 2017 - dl.acm.org
The process of performance tuning is time consuming and costly even if it is carried out
automatically. It is crucial to learn from the experience of experts. Our long-term goal is to …