Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

Medical image segmentation on GPUs – A comprehensive review

E Smistad, TL Falch, M Bozorgi, AC Elster… - Medical image …, 2015 - Elsevier
Segmentation of anatomical structures, from modalities like computed tomography (CT),
magnetic resonance imaging (MRI) and ultrasound, is a key enabling technology for medical …

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

J Ragan-Kelley, C Barnes, A Adams, S Paris… - ACM SIGPLAN …, 2013 - dl.acm.org
Image processing pipelines combine the challenges of stencil computations and stream
programs. They are composed of large graphs of different stencil stages, as well as complex …

PolyMage: Automatic optimization for image processing pipelines

RT Mullapudi, V Vasista, U Bondhugula - ACM SIGARCH Computer …, 2015 - dl.acm.org
This paper presents the design and implementation of PolyMage, a domain-specific
language and compiler for image processing pipelines. An image processing pipeline can …

A heuristic clustering-based task deployment approach for load balancing using Bayes theorem in cloud environment

J Zhao, K Yang, X Wei, Y Ding, L Hu… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
Aiming at the current problems that most physical hosts in the cloud data center are so
overloaded that it makes the whole cloud data center's load imbalanced and that existing load …

SODA: Stencil with optimized dataflow architecture

Y Chi, J Cong, P Wei, P Zhou - 2018 IEEE/ACM International …, 2018 - ieeexplore.ieee.org
Stencil computation is one of the most important kernels in many application domains such
as image processing, solving partial differential equations, and cellular automata. Many of …

A study of the fundamental performance characteristics of GPUs and CPUs for database analytics

A Shanbhag, S Madden, X Yu - Proceedings of the 2020 ACM SIGMOD …, 2020 - dl.acm.org
There has been a significant amount of excitement and recent work on GPU-based database
systems. Previous work has claimed that these systems can perform orders of magnitude …

A stencil compiler for short-vector SIMD architectures

T Henretty, R Veras, F Franchetti, LN Pouchet… - Proceedings of the 27th …, 2013 - dl.acm.org
Stencil computations are an integral component of applications in a number of scientific
computing domains. Short-vector SIMD instruction sets are ubiquitous on modern …

Hybrid hexagonal/classical tiling for GPUs

T Grosser, A Cohen, J Holewinski… - Proceedings of Annual …, 2014 - dl.acm.org
Time-tiling is necessary for the efficient execution of iterative stencil computations. Classical
hyper-rectangular tiles cannot be used due to the combination of backward and forward …

Casper: Accelerating stencil computations using near-cache processing

A Denzler, GF Oliveira, N Hajinazar, R Bera… - IEEE …, 2023 - ieeexplore.ieee.org
Stencil computations are commonly used in a wide variety of scientific applications, ranging
from large-scale weather prediction to solving partial differential equations. Stencil …