Graphene: An IR for optimized tensor computations on GPUs

B Hagedorn, B Fan, H Chen, C Cecka… - Proceedings of the 28th …, 2023 - dl.acm.org
Modern GPUs accelerate computations and data movements of multi-dimensional tensors in
hardware. However, expressing optimized tensor computations in software is extremely …

A domain-extensible compiler with controllable automation of optimisations

T Koehler - arXiv preprint arXiv:2212.12035, 2022 - arxiv.org
In high performance domains like image processing, physics simulation or machine
learning, program performance is critical. Programmers called performance engineers are …