SaC/C formulations of the all‐pairs N‐body problem and their performance on SMPs and GPGPUs

A Šinkarovs, SB Scholz, R Bernecky… - Concurrency and …, 2014 - Wiley Online Library
This paper describes our experience in implementing the classical N‐body algorithm in SaC
and analysing the runtime performance achieved on three different machines: a dual …

Rank-Polymorphism for Shape-Guided Blocking

A Šinkarovs, T Koopman, SB Scholz - Proceedings of the 11th ACM …, 2023 - dl.acm.org
Many numerical algorithms on matrices or tensors can be formulated in a blocking style
which improves performance due to better cache locality. In imperative languages, blocking …

Type‐driven data layouts for improved vectorisation

A Šinkarovs, SB Scholz - Concurrency and Computation …, 2016 - Wiley Online Library
Vector instructions of modern CPUs are crucially important for the performance of
compute‐intensive algorithms. Auto‐vectorisation often fails because of an unfortunate choice of data …

Data layout inference for code vectorisation

A Šinkarovs, SB Scholz - 2013 International Conference on …, 2013 - ieeexplore.ieee.org
SIMD instructions of modern CPUs are crucially important for the performance of
compute-intensive algorithms. Auto-vectorisation often fails due to an unfortunate choice of data …

Data Layout Types: a type-based approach to automatic data layout transformations for improved SIMD vectorisation

A Šinkarovs - 2015 - ros.hw.ac.uk
The increasing complexity of modern hardware requires sophisticated programming
techniques for programs to run efficiently. At the same time, increased power of modern …