DNNFusion: accelerating deep neural networks execution with advanced operator fusion

W Niu, J Guan, Y Wang, G Agrawal, B Ren - Proceedings of the 42nd …, 2021 - dl.acm.org
Deep Neural Networks (DNNs) have emerged as the core enabler of many major
applications on mobile devices. To achieve high accuracy, DNN models have become …

Polly—performing polyhedral optimizations on a low-level intermediate representation

T Grosser, A Groesslinger, C Lengauer - Parallel Processing Letters, 2012 - World Scientific
The polyhedral model for loop parallelization has proved to be an effective tool for advanced
optimization and automatic parallelization of programs in higher-level languages. Yet, to …

PolyMage: Automatic optimization for image processing pipelines

RT Mullapudi, V Vasista, U Bondhugula - ACM SIGARCH Computer …, 2015 - dl.acm.org
This paper presents the design and implementation of PolyMage, a domain-specific
language and compiler for image processing pipelines. An image processing pipeline can …

AKG: automatic kernel generation for neural processing units using polyhedral transformations

J Zhao, B Li, W Nie, Z Geng, R Zhang, X Gao… - Proceedings of the …, 2021 - dl.acm.org
Existing tensor compilers have proven their effectiveness in deploying deep neural networks
on general-purpose hardware like CPU and GPU, but optimizing for neural processing units …

Polygeist: Raising C to polyhedral MLIR

WS Moses, L Chelini, R Zhao… - 2021 30th International …, 2021 - ieeexplore.ieee.org
We present Polygeist, a new compilation flow that connects the MLIR compiler infrastructure
to cutting edge polyhedral optimization tools. It consists of a C and C++ frontend capable of …

Tiling stencil computations to maximize parallelism

V Bandishti, I Pananilath… - SC'12: Proceedings of …, 2012 - ieeexplore.ieee.org
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of
the iteration space and a set of tiling hyperplanes such that all tiles along that face can be …

Polyhedral extraction tool

S Verdoolaege, T Grosser - … International Workshop on …, 2012 - acohen.gitlabpages.inria.fr
We present a new library for extracting a polyhedral model from C source. The library is
based on clang, the LLVM C frontend, and isl, a library for manipulating quasi-affine sets …

Loop transformations: convexity, pruning and optimization

LN Pouchet, U Bondhugula, C Bastoul, A Cohen… - ACM SIGPLAN …, 2011 - dl.acm.org
High-level loop transformations are a key instrument in mapping computational kernels to
effectively exploit the resources in modern processor architectures. Nevertheless, selecting …

The next 700 accelerated layers: From mathematical expressions of network computation graphs to accelerated GPU kernels, automatically

N Vasilache, O Zinenko, T Theodoridis… - ACM Transactions on …, 2019 - dl.acm.org
Deep learning frameworks automate the deployment, distribution, synchronization, memory
allocation, and hardware acceleration of models represented as graphs of computational …

Optimizing the memory hierarchy by compositing automatic transformations on computations and data

J Zhao, P Di - 2020 53rd Annual IEEE/ACM International …, 2020 - ieeexplore.ieee.org
Optimizing compilers exploit the memory hierarchy using loop tiling and fusion, but these
two transformations usually interfere with each other due to the oversight of transformations …