Milepost GCC: Machine learning enabled self-tuning compiler
Tuning compiler optimizations for rapidly evolving hardware makes porting and extending
an optimizing compiler for each new platform extremely challenging. Iterative optimization is …
[PDF] CHiLL: A framework for composing high-level loop transformations
C Chen, J Chame, M Hall - 2008 - Citeseer
This paper describes a general and robust loop transformation framework that enables
compilers to generate efficient code on complex loop nests. Despite two decades of prior …
PolyDL: Polyhedral optimizations for creation of high-performance DL primitives
Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of
DNNs is becoming ubiquitous, including in software for image recognition, speech …
Towards making autotuning mainstream
Autotuning systems employ empirical techniques to evaluate the suitability of a search
space of possible implementations of a computation. Autotuning has emerged as a critical …
Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology
J Shin, MW Hall, J Chame, C Chen… - … to State-of-the-Art Results, 2010 - Springer
Autotuning technology has emerged recently as a systematic process for evaluating
alternative implementations of a computation to select the best-performing solution for a …
Efficiency improvements of iterative numerical algorithms on modern architectures
J Treibig - 2008 - search.proquest.com
For many numerical codes the transport of data from main memory to the registers is
commonly considered to be the main limiting factor to achieve high performance on present …
Hydra: Automatic algorithm exploration from linear algebra equations
AX Duchâteau, D Padua… - Proceedings of the 2013 …, 2013 - ieeexplore.ieee.org
Hydra accepts an equation written in terms of operations on matrices and automatically
produces highly efficient code to solve these equations. Processing of the equation starts by …
[BOOK] On the computer generation of adaptive numerical libraries
F De Mesmay - 2010 - search.proquest.com
Very fast runtime is crucial in many applications in scientific computing, multimedia
processing, communication, and control. Most of these applications spend the bulk of the …
Statistical models to accelerate software development by means of iterative compilation
A Kamińska, W Bielecki - Computer Science, 2016 - yadda.icm.edu.pl
Minimization of data-processing time and reduction of software-development time are
important practical problems to be tackled by modern computer science. This paper presents …
[BOOK] Automatic algorithm derivation and exploration in linear algebra for parallelism and locality
AX Duchateau - 2013 - search.proquest.com
Parallelization is one of the major challenges for programmers. But parallelizing existing
code is a hard task that can lead to less than optimal solutions since sequential programs …