A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Learning performance-improving code edits

A Shypula, A Madaan, Y Zeng, U Alon… - arXiv preprint arXiv …, 2023 - arxiv.org
With the waning of Moore's law, optimizing program performance has become a major focus
of software research. However, high-level optimizations such as API and algorithm changes …

ProGraML: A graph-based program representation for data flow analysis and compiler optimizations

C Cummins, ZV Fisches, T Ben-Nun… - International …, 2021 - proceedings.mlr.press
Machine learning (ML) is increasingly seen as a viable approach for building
compiler optimization heuristics, but many ML methods cannot replicate even the simplest of …

IR2Vec: LLVM IR based scalable program embeddings

S VenkataKeerthy, R Aggarwal, S Jain… - ACM Transactions on …, 2020 - dl.acm.org
We propose IR2Vec, a Concise and Scalable encoding infrastructure to represent programs
as a distributed embedding in continuous space. This distributed embedding is obtained by …

TenSet: A large-scale program performance dataset for learned tensor compilers

L Zheng, R Liu, J Shao, T Chen… - Thirty-fifth Conference …, 2021 - openreview.net
Search-based tensor compilers can greatly accelerate the execution of machine learning
models by generating high-performance tensor programs, such as matrix multiplications and …

Evaluation of compilers' capability of automatic vectorization based on source code analysis

JG Feng, YP He, QM Tao - Scientific Programming, 2021 - Wiley Online Library
Automatic vectorization is an important technique for compilers to improve the parallelism of
programs. With the widespread usage of SIMD (Single Instruction Multiple Data) extensions …

Efficient Listing with Set Intersection Speedup

Z Yuan, Y Peng, P Cheng, L Han, X Lin… - 2022 IEEE 38th …, 2022 - ieeexplore.ieee.org
Listing all k-cliques is a fundamental problem in graph mining, with applications in finance,
biology, and social network analysis. However, owing to the exponential growth of the …

DeepCM: Deep neural networks to improve accuracy prediction of database cost models

A Ouared, A Chadli, MA Daoud - … and Computation: Practice …, 2022 - Wiley Online Library
A major challenge for many database management tasks including admission control, query
scheduling, progress monitoring and self‐driving data storage systems is to enhance …

Core placement optimization for multi-chip many-core neural network systems with reinforcement learning

N Wu, L Deng, G Li, Y Xie - ACM Transactions on Design Automation of …, 2020 - dl.acm.org
Multi-chip many-core neural network systems are capable of providing high parallelism
benefited from decentralized execution, and they can be scaled to very large systems with …

Performance optimization using multimodal modeling and heterogeneous GNN

A Dutta, J Alcaraz, A TehraniJamsaz, E Cesar… - Proceedings of the …, 2023 - dl.acm.org
Growing heterogeneity and configurability in HPC architectures has made auto-tuning
applications and runtime parameters on these systems very complex. Users are presented …