DNNFusion: accelerating deep neural networks execution with advanced operator fusion
Deep Neural Networks (DNNs) have emerged as the core enabler of many major
applications on mobile devices. To achieve high accuracy, DNN models have become …
Polly—performing polyhedral optimizations on a low-level intermediate representation
T Grosser, A Groesslinger, C Lengauer - Parallel Processing Letters, 2012 - World Scientific
The polyhedral model for loop parallelization has proved to be an effective tool for advanced
optimization and automatic parallelization of programs in higher-level languages. Yet, to …
PolyMage: Automatic optimization for image processing pipelines
RT Mullapudi, V Vasista, U Bondhugula - ACM SIGARCH Computer …, 2015 - dl.acm.org
This paper presents the design and implementation of PolyMage, a domain-specific
language and compiler for image processing pipelines. An image processing pipeline can …
AKG: automatic kernel generation for neural processing units using polyhedral transformations
Existing tensor compilers have proven their effectiveness in deploying deep neural networks
on general-purpose hardware like CPU and GPU, but optimizing for neural processing units …
Polygeist: Raising C to polyhedral MLIR
We present Polygeist, a new compilation flow that connects the MLIR compiler infrastructure
to cutting edge polyhedral optimization tools. It consists of a C and C++ frontend capable of …
Tiling stencil computations to maximize parallelism
V Bandishti, I Pananilath… - SC'12: Proceedings of …, 2012 - ieeexplore.ieee.org
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of
the iteration space and a set of tiling hyperplanes such that all tiles along that face can be …
Polyhedral extraction tool
S Verdoolaege, T Grosser - … International Workshop on …, 2012 - acohen.gitlabpages.inria.fr
We present a new library for extracting a polyhedral model from C source. The library is
based on clang, the LLVM C frontend, and isl, a library for manipulating quasi-affine sets …
Loop transformations: convexity, pruning and optimization
High-level loop transformations are a key instrument in mapping computational kernels to
effectively exploit the resources in modern processor architectures. Nevertheless, selecting …
The next 700 accelerated layers: From mathematical expressions of network computation graphs to accelerated GPU kernels, automatically
Deep learning frameworks automate the deployment, distribution, synchronization, memory
allocation, and hardware acceleration of models represented as graphs of computational …
Optimizing the memory hierarchy by compositing automatic transformations on computations and data
Optimizing compilers exploit the memory hierarchy using loop tiling and fusion, but these
two transformations usually interfere with each other due to the oversight of transformations …