A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Vectorization for digital signal processors via equality saturation

A VanHattum, R Nigam, VT Lee, J Bornholt… - Proceedings of the 26th …, 2021 - dl.acm.org
Applications targeting digital signal processors (DSPs) benefit from fast implementations of
small linear algebra kernels. While existing auto-vectorizing compilers are effective at …

VeGen: a vectorizer generator for SIMD and beyond

Y Chen, C Mendis, M Carbin… - Proceedings of the 26th …, 2021 - dl.acm.org
Vector instructions are ubiquitous in modern processors. Traditional compiler
auto-vectorization techniques have focused on targeting single instruction multiple data (SIMD) …

DiffTune: Optimizing CPU simulator parameters with learned differentiable surrogates

A Renda, Y Chen, C Mendis… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
CPU simulators are useful tools for modeling CPU execution behavior. However, they suffer
from inaccuracies due to the cost and complexity of setting their fine-grained parameters …

uiCA: Accurate throughput prediction of basic blocks on recent Intel microarchitectures

A Abel, J Reineke - Proceedings of the 36th ACM International …, 2022 - dl.acm.org
Performance models that statically predict the steady-state throughput of basic blocks on
particular microarchitectures, such as IACA, Ithemal, llvm-mca, OSACA, or CQA, can guide …

Coyote: A compiler for vectorizing encrypted arithmetic circuits

R Malik, K Sheth, M Kulkarni - Proceedings of the 28th ACM International …, 2023 - dl.acm.org
Fully Homomorphic Encryption (FHE) is a scheme that allows a computational circuit to
operate on encrypted data and produce a result that, when decrypted, yields the result of the …

Evaluation of compilers' capability of automatic vectorization based on source code analysis

JG Feng, YP He, QM Tao - Scientific Programming, 2021 - Wiley Online Library
Automatic vectorization is an important technique for compilers to improve the parallelism of
programs. With the widespread usage of SIMD (Single Instruction Multiple Data) extensions …

Compiler auto-vectorization with imitation learning

C Mendis, C Yang, Y Pu… - Advances in Neural …, 2019 - proceedings.neurips.cc
Modern microprocessors are equipped with single instruction multiple data (SIMD) or vector
instruction sets which allow compilers to exploit fine-grained data level parallelism. To …

A tensor compiler with automatic data packing for simple and efficient fully homomorphic encryption

A Krastev, N Samardzic, S Langowski… - Proceedings of the …, 2024 - dl.acm.org
Fully Homomorphic Encryption (FHE) enables computing on encrypted data, letting clients
securely offload computation to untrusted servers. While enticing, FHE has two key …

Facile: Fast, accurate, and interpretable basic-block throughput prediction

A Abel, S Sharma, J Reineke - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Basic-block throughput models such as uiCA, IACA, GRANITE, Ithemal, llvm-mca, OSACA,
or CQA guide optimizing compilers and help performance engineers identify and eliminate …