TpuGraphs: A performance prediction dataset on large tensor computational graphs
M Phothilimthana, S Abu-El-Haija… - Advances in …, 2024 - proceedings.neurips.cc
Precise hardware performance models play a crucial role in code optimizations. They can
assist compilers in making heuristic decisions or aid autotuners in identifying the optimal …
WACO: learning workload-aware co-optimization of the format and schedule of a sparse tensor program
In this paper, we present WACO, a novel method of co-optimizing the format and the
schedule of a given sparsity pattern in a sparse tensor program. A core challenge in this …
Tenset: A large-scale program performance dataset for learned tensor compilers
Search-based tensor compilers can greatly accelerate the execution of machine learning
models by generating high-performance tensor programs, such as matrix multiplications and …
A flexible approach to autotuning multi-pass machine learning compilers
Search-based techniques have been demonstrated effective in solving complex optimization
problems that arise in domain-specific compilers for machine learning (ML). Unfortunately …
Supersonic: Learning to generate source code optimizations in C/C++
Software optimization refines programs for resource efficiency while preserving functionality.
Traditionally, it is a process done by developers and compilers. This paper introduces a third …
Tensor program optimization with probabilistic programs
Automatic optimization for tensor programs becomes increasingly important as we deploy
deep learning in various environments, and efficient optimization relies on a rich search …
TLP: A deep learning-based cost model for tensor program tuning
Tensor program tuning is a non-convex optimization problem, to which search-
based approaches have proven to be effective. At the core of the search-based approaches …
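The learned cost model at the core of such search-based approaches can be illustrated with a minimal sketch: featurize each candidate schedule, fit a regression model to measured runtimes, then rank candidates by predicted cost instead of measuring every one. The feature names (`tile`, `unroll`, `vectorize`) and the linear model below are illustrative assumptions, not TLP's actual feature set or architecture (TLP uses a deep network over serialized schedule primitives).

```python
import random

# Hypothetical featurization: summarize a candidate schedule as a few
# numeric features. These names are illustrative, not TLP's feature set.
def featurize(schedule):
    return [schedule["tile"], schedule["unroll"], schedule["vectorize"]]

# A minimal linear cost model trained by stochastic gradient descent to
# predict measured runtime from features.
def train_cost_model(samples, lr=0.01, epochs=500):
    dim = len(featurize(samples[0][0]))
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for sched, runtime in samples:
            x = featurize(sched)
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - runtime
            for i in range(dim):
                w[i] -= lr * err * x[i]
            b -= lr * err
    return w, b

def predict(model, schedule):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, featurize(schedule))) + b

random.seed(0)
# Synthetic training data: runtime improves with tiling and worsens with
# excessive unrolling (a made-up relationship, for illustration only).
samples = []
for _ in range(100):
    s = {"tile": random.uniform(0, 1), "unroll": random.uniform(0, 1),
         "vectorize": random.uniform(0, 1)}
    samples.append((s, 2.0 - 1.0 * s["tile"] + 0.5 * s["unroll"]))

model = train_cost_model(samples)
# Rank candidate schedules by predicted runtime instead of measuring each.
candidates = [{"tile": 0.9, "unroll": 0.1, "vectorize": 0.5},
              {"tile": 0.1, "unroll": 0.9, "vectorize": 0.5}]
best = min(candidates, key=lambda s: predict(model, s))
```

The payoff is the last step: once trained, the model ranks thousands of candidates at negligible cost, so expensive on-device measurements are spent only on the most promising schedules.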
One-shot tuner for deep learning compilers
Auto-tuning DL compilers are gaining ground as an optimizing back-end for DL frameworks.
While existing work can generate deep learning models that exceed the performance of …
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
G Consolaro, Z Zhang, H Razanajato… - 2024 IEEE/ACM …, 2024 - ieeexplore.ieee.org
Polyhedral techniques have been widely used for automatic code optimization in low-level
compilers and higher-level processes. Loop optimization is central to this technique, and …
Transfer-tuning: Reusing auto-schedules for efficient tensor program code generation
Auto-scheduling for tensor programs is a process where a search algorithm automatically
explores candidate schedules (program transformations) for a given program on a target …
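The auto-scheduling loop described above can be sketched as follows. This is a toy version under stated assumptions: the "schedule" is reduced to a single tile size for a matrix multiplication, and `measure` is a made-up analytic stand-in for an on-device measurement (real auto-schedulers explore far richer transformation spaces and time candidates on hardware).

```python
N = 512  # hypothetical matrix dimension

def measure(tile):
    # Stand-in for an on-device measurement. Penalizes:
    #  - padding overhead when the tile does not divide N evenly,
    #  - tiles exceeding a made-up 64x64 cache budget,
    #  - per-tile loop overhead, which favors larger tiles.
    waste = (-N) % tile
    cache_penalty = max(0, tile * tile - 64 * 64)
    loop_overhead = (N / tile) * 5
    return waste * 10 + cache_penalty + loop_overhead

def auto_schedule(candidates):
    # Exhaustive search over candidate schedules, keeping the one with
    # the lowest measured cost.
    best, best_cost = None, float("inf")
    for tile in candidates:
        cost = measure(tile)
        if cost < best_cost:
            best, best_cost = tile, cost
    return best

tile = auto_schedule([8, 16, 32, 48, 64, 96, 128])
```

Under this toy cost function the search settles on the largest tile that both divides N and fits the cache budget; transfer-tuning's contribution is to reuse schedules found by such searches across programs rather than searching from scratch each time.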