Improving performance of hypermatrix Cholesky factorization

M Hall, J Chame, C Chen, J Shin, G Rudy… - … and Compilers for …, 2010 - Springer

In this paper, we describe transformation recipes, which provide a high-level interface to the
code transformation and code generation capability of a compiler. These recipes can be …

Cited by 138 Related articles All 14 versions

[PDF] psu.edu

Speeding up nek5000 with autotuning and specialization

J Shin, MW Hall, J Chame, C Chen, PF Fischer… - Proceedings of the 24th …, 2010 - dl.acm.org

Autotuning technology has emerged recently as a systematic process for evaluating
alternative implementations of a computation, in order to select the best-performing solution …

Cited by 64 Related articles All 8 versions

[PDF] anl.gov

Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology

J Shin, MW Hall, J Chame, C Chen… - … to State-of-the-Art Results, 2010 - Springer

Autotuning technology has emerged recently as a systematic process for evaluating
alternative implementations of a computation to select the best-performing solution for a …

Cited by 29 Related articles All 13 versions

[PDF] core.ac.uk

[BOOK][B] CUDA-CHiLL: A programming language interface for GPGPU optimizations and code generation

G Rudy - 2010 - search.proquest.com

The advent of the era of cheap and pervasive many-core and multicore parallel systems has
highlighted the disparity of the performance achieved between novice and expert …

Cited by 24 Related articles All 3 versions

Improving high-performance sparse libraries using compiler-assisted specialization: A PETSc case study

S Ramalingam, M Hall, C Chen - 2012 IEEE 26th International …, 2012 - ieeexplore.ieee.org

Scientific libraries are written in a general way in anticipation of a variety of use cases that
reduce optimization opportunities. Significant performance gains can be achieved by …

Cited by 15 Related articles All 4 versions

[PDF] researchgate.net

Analysis of a sparse hypermatrix Cholesky with fixed-sized blocking

JR Herrero, JJ Navarro - Applicable Algebra in Engineering …, 2007 - Springer

We present the way in which we have constructed an implementation of a sparse Cholesky
factorization based on a hypermatrix data structure. This data structure is a storage scheme …

Cited by 20 Related articles All 8 versions

[PDF] upc.edu

[BOOK][B] A framework for efficient execution of matrix computations

JR Herrero Zaragoza - 2006 - upcommons.upc.edu

Matrix computations lie at the heart of most scientific computational tasks. The solution of
linear systems of equations is a very frequent operation in many fields in science …

Cited by 19 Related articles All 14 versions

[PDF] upc.edu

Compiler-optimized kernels: An efficient alternative to hand-coded inner kernels

JR Herrero, JJ Navarro - … Conference on Computational Science and Its …, 2006 - Springer

The use of highly optimized inner kernels is of paramount importance for obtaining efficient
numerical algorithms. Often, such kernels are created by hand. In this paper, however, we …

Cited by 13 Related articles All 12 versions

A simulation study of effect of fish body size on communication performance in a fish farm monitoring environment

Y Taniguchi - 2016 European Modelling Symposium (EMS), 2016 - ieeexplore.ieee.org

In recent years, sensor network technologies have attracted attentions in the field of the
primary industry such as the aquaculture industry. We have proposed a fish farm monitoring …

Cited by 4 Related articles All 4 versions

[PDF] academia.edu

Adapting linear algebra codes to the memory hierarchy using a hypermatrix scheme

JR Herrero, JJ Navarro - … Conference on Parallel Processing and Applied …, 2005 - Springer

LNCS 3911 - Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix
Scheme …

Cited by 6 Related articles All 15 versions