Automatic generation of multi-objective polyhedral compiler transformations

J Zhao, B Li, W Nie, Z Geng, R Zhang, X Gao… - Proceedings of the …, 2021 - dl.acm.org

Existing tensor compilers have proven their effectiveness in deploying deep neural networks
on general-purpose hardware like CPU and GPU, but optimizing for neural processing units …

Cited by 67 Related articles All 8 versions

[PDF] arxiv.org

PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler

G Consolaro, Z Zhang, H Razanajato… - 2024 IEEE/ACM …, 2024 - ieeexplore.ieee.org

Polyhedral techniques have been widely used for automatic code optimization in low-level
compilers and higher-level processes. Loop optimization is central to this technique, and …

Cited by 5 Related articles All 6 versions

[PDF] arxiv.org

Report of the workshop on program synthesis for scientific computing

H Finkel, I Laguna - arXiv preprint arXiv:2102.01687, 2021 - arxiv.org

Program synthesis is an active research field in academia, national labs, and industry. Yet,
work directly applicable to scientific computing, while having some impressive successes …

Cited by 4 Related articles All 3 versions

[PDF] acm.org

Source matching and rewriting for MLIR using string-based automata

V Espindola, L Zago, H Yviquel, G Araujo - ACM Transactions on …, 2023 - dl.acm.org

A typical compiler flow relies on a uni-directional sequence of translation/optimization steps
that lower the program abstract representation, making it hard to preserve higher-level …

Cited by 7 Related articles All 2 versions

[PDF] acm.org

Tile size selection of affine programs for GPGPUs using polyhedral cross-compilation

K Abdelaal, M Kong - Proceedings of the ACM International Conference …, 2021 - dl.acm.org

Loop tiling is a key high-level transformation which is known to maximize locality in loop
intensive programs. It has been successfully applied to a number of applications including …

Cited by 10 Related articles

[PDF] researchgate.net

Towards intelligent compiler optimization

M Kovac, M Brcic, A Krajna… - 2022 45th Jubilee …, 2022 - ieeexplore.ieee.org

The future of computation is massively parallel and heterogeneous with specialized
accelerator devices and instruction sets in both edge- and cluster-computing. However …

Cited by 4 Related articles All 5 versions

[PDF] acm.org

On the impact of affine loop transformations in qubit allocation

M Kong - ACM Transactions on Quantum Computing, 2021 - dl.acm.org

Most quantum compiler transformations and qubit allocation techniques to date are either
peep-hole focused or rely on sliding windows that depend on a number of external …

Cited by 4 Related articles

[HTML] sciencedirect.com

[HTML] Abstractions for C++ code optimizations in parallel high-performance applications

J Klepl, A Šmelko, L Rozsypal, M Kruliš - Parallel Computing, 2024 - Elsevier

Many computational problems consider memory throughput a performance bottleneck,
especially in the domain of parallel computing. Software needs to be attuned to hardware …

[PDF] acm.org

Collage: Seamless integration of deep learning backends with automatic placement

B Jeon, S Park, P Liao, S Xu, T Chen, Z Jia - Proceedings of the …, 2022 - dl.acm.org

The strong demand for efficient and performant deployment of Deep Learning (DL)
applications prompts the rapid development of a rich DL ecosystem. To keep up with this fast …

Cited by 4 Related articles All 5 versions

[PDF] acm.org

Efficiently Learning Locality Optimizations by Decomposing Transformation Domains

TR Patabandi, M Hall - Proceedings of the 32nd ACM SIGPLAN …, 2023 - dl.acm.org

Optimizing compilers for efficient machine learning are more important than ever due to the
rising ubiquity of the application domain in numerous facets of life. Predictive model-guided …

Cited by 1 Related articles All 2 versions