Loop transformation recipes for code generation and auto-tuning
M Hall, J Chame, C Chen, J Shin, G Rudy… - … and Compilers for …, 2010 - Springer
In this paper, we describe transformation recipes, which provide a high-level interface to the
code transformation and code generation capability of a compiler. These recipes can be …
Speeding up nek5000 with autotuning and specialization
J Shin, MW Hall, J Chame, C Chen, PF Fischer… - Proceedings of the 24th …, 2010 - dl.acm.org
Autotuning technology has emerged recently as a systematic process for evaluating
alternative implementations of a computation, in order to select the best-performing solution …
Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology
J Shin, MW Hall, J Chame, C Chen… - … to State-of-the-Art Results, 2010 - Springer
Autotuning technology has emerged recently as a systematic process for evaluating
alternative implementations of a computation to select the best-performing solution for a …
[BOOK][B] CUDA-CHiLL: A programming language interface for GPGPU optimizations and code generation
G Rudy - 2010 - search.proquest.com
The advent of the era of cheap and pervasive many-core and multicore parallel systems has
highlighted the disparity of the performance achieved between novice and expert …
Improving high-performance sparse libraries using compiler-assisted specialization: A PETSc case study
S Ramalingam, M Hall, C Chen - 2012 IEEE 26th International …, 2012 - ieeexplore.ieee.org
Scientific libraries are written in a general way in anticipation of a variety of use cases that
reduce optimization opportunities. Significant performance gains can be achieved by …
Analysis of a sparse hypermatrix Cholesky with fixed-sized blocking
JR Herrero, JJ Navarro - Applicable Algebra in Engineering …, 2007 - Springer
We present the way in which we have constructed an implementation of a sparse Cholesky
factorization based on a hypermatrix data structure. This data structure is a storage scheme …
[BOOK][B] A framework for efficient execution of matrix computations
JR Herrero Zaragoza - 2006 - upcommons.upc.edu
Matrix computations lie at the heart of most scientific computational tasks. The solution of
linear systems of equations is a very frequent operation in many fields in science …
Compiler-optimized kernels: An efficient alternative to hand-coded inner kernels
JR Herrero, JJ Navarro - … Conference on Computational Science and Its …, 2006 - Springer
The use of highly optimized inner kernels is of paramount importance for obtaining efficient
numerical algorithms. Often, such kernels are created by hand. In this paper, however, we …
A simulation study of effect of fish body size on communication performance in a fish farm monitoring environment
Y Taniguchi - 2016 European Modelling Symposium (EMS), 2016 - ieeexplore.ieee.org
In recent years, sensor network technologies have attracted attention in the field of the
primary industry such as the aquaculture industry. We have proposed a fish farm monitoring …
Adapting linear algebra codes to the memory hierarchy using a hypermatrix scheme
JR Herrero, JJ Navarro - … Conference on Parallel Processing and Applied …, 2005 - Springer
LNCS 3911 - Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix Scheme …