Tiramisu: A polyhedral compiler for expressing fast and portable code
R Baghdadi, J Ray, MB Romdhane… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
This paper introduces Tiramisu, a polyhedral framework designed to generate high
performance code for multiple platforms including multicores, GPUs, and distributed …
Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
Image processing pipelines combine the challenges of stencil computations and stream
programs. They are composed of large graphs of different stencil stages, as well as complex …
Optoelectronic device simulations based on macroscopic Maxwell–Bloch equations
C Jirauschek, M Riesch… - Advanced Theory and …, 2019 - Wiley Online Library
Due to their intuitiveness, flexibility, and relative numerical efficiency, the macroscopic
Maxwell–Bloch (MB) equations are a widely used semiclassical and semi …
PolyMage: Automatic optimization for image processing pipelines
RT Mullapudi, V Vasista, U Bondhugula - ACM SIGARCH Computer …, 2015 - dl.acm.org
This paper presents the design and implementation of PolyMage, a domain-specific
language and compiler for image processing pipelines. An image processing pipeline can …
The Pochoir stencil compiler
A stencil computation repeatedly updates each point of a d-dimensional grid as a function of
itself and its near neighbors. Parallel cache-efficient stencil algorithms based on "trapezoidal …
High-performance code generation for stencil computations on GPU architectures
J Holewinski, LN Pouchet, P Sadayappan - Proceedings of the 26th ACM …, 2012 - dl.acm.org
Stencil computations arise in many scientific computing domains, and often represent time-
critical portions of applications. There is significant interest in offloading these computations …
A survey on parallel computing and its applications in data-parallel problems using GPU architectures
CA Navarro, N Hitschfeld-Kahler… - … in Computational Physics, 2014 - cambridge.org
Parallel computing has become an important subject in the field of computer science and
has proven to be critical when researching high performance solutions. The evolution of …
SODA: Stencil with optimized dataflow architecture
Stencil computation is one of the most important kernels in many application domains such
as image processing, solving partial differential equations, and cellular automata. Many of …
AKG: automatic kernel generation for neural processing units using polyhedral transformations
Existing tensor compilers have proven their effectiveness in deploying deep neural networks
on general-purpose hardware like CPU and GPU, but optimizing for neural processing units …
Tiling stencil computations to maximize parallelism
V Bandishti, I Pananilath… - SC'12: Proceedings of …, 2012 - ieeexplore.ieee.org
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of
the iteration space and a set of tiling hyperplanes such that all tiles along that face can be …