A survey on compiler autotuning using machine learning

AH Ashouri, W Killian, J Cavazos, G Palermo… - ACM Computing …, 2018 - dl.acm.org
Since the mid-1990s, researchers have been trying to use machine-learning-based
approaches to solve a number of different compiler optimization problems. These …

A Survey of Design and Optimization for Systolic Array-based DNN Accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org
In recent years, it has been witnessed that the systolic array is a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays also encountered many …

Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions

N Vasilache, O Zinenko, T Theodoridis, P Goyal… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep learning models with convolutional and recurrent networks are now ubiquitous and
analyze massive amounts of audio, image, video, text and graph data, with applications in …

A survey and evaluation of FPGA high-level synthesis tools

R Nane, VM Sima, C Pilato, J Choi… - … on Computer-Aided …, 2015 - ieeexplore.ieee.org
High-level synthesis (HLS) is increasingly popular for the design of high-performance and
energy-efficient heterogeneous systems, shortening time-to-market and addressing today's …

Tiramisu: A polyhedral compiler for expressing fast and portable code

R Baghdadi, J Ray, MB Romdhane… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
This paper introduces Tiramisu, a polyhedral framework designed to generate high
performance code for multiple platforms including multicores, GPUs, and distributed …

AutoSA: A polyhedral compiler for high-performance systolic arrays on FPGA

J Wang, L Guo, J Cong - The 2021 ACM/SIGDA International Symposium …, 2021 - dl.acm.org
While systolic array architectures have the potential to deliver tremendous performance, it is
notoriously challenging to customize an efficient systolic array processor for a target …

Polly—performing polyhedral optimizations on a low-level intermediate representation

T Grosser, A Groesslinger, C Lengauer - Parallel Processing Letters, 2012 - World Scientific
The polyhedral model for loop parallelization has proved to be an effective tool for advanced
optimization and automatic parallelization of programs in higher-level languages. Yet, to …

Polyhedral parallel code generation for CUDA

S Verdoolaege, J Carlos Juega, A Cohen… - ACM Transactions on …, 2013 - dl.acm.org
This article addresses the compilation of a sequential program for parallel execution on a
modern GPU. To this end, we present a novel source-to-source compiler called PPCG …

Hardware acceleration of sparse and irregular tensor computations of ml models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational-and memory-intensive applications, tensors of these …

PolySA: Polyhedral-based systolic array auto-compilation

J Cong, J Wang - 2018 IEEE/ACM International Conference on …, 2018 - ieeexplore.ieee.org
Automatic systolic array generation has long been an interesting topic due to the need to
reduce the lengthy development cycles of manual designs. Existing automatic systolic array …