Portability study of an OpenCL algorithm for automatic target detection in hyperspectral images
In the last decades, the problem of target detection has received considerable attention in
remote sensing applications. When this problem is tackled using hyperspectral images with …
Low precision matrix multiplication for efficient deep learning in NVIDIA Carmel processors
P San Juan, R Rodríguez-Sánchez, FD Igual… - The Journal of …, 2021 - Springer
We introduce a high performance, multi-threaded realization of the gemm kernel for the
ARMv8.2 architecture that operates with 16-bit (half precision) …
Heterogeneous distributed computing based on high‐level abstractions
M Viñas, BB Fraguela, D Andrade… - Concurrency and …, 2018 - Wiley Online Library
The rise of heterogeneous systems has given place to great challenges for users as they
involve new concepts, restrictions, and frameworks. Their exploitation is further complicated …
High productivity multi-device exploitation with the Heterogeneous Programming Library
Heterogeneous devices require much more work from programmers than traditional CPUs,
particularly when there are several of them, as each one has its own memory space. Multi …
Towards a high level approach for the programming of heterogeneous clusters
M Vinas, BB Fraguela, D Andrade… - 2016 45th International …, 2016 - ieeexplore.ieee.org
The programming of heterogeneous clusters is inherently complex, as these architectures
require programmers to manage both distributed memory and computational units with a …
Facilitating the development of stencil applications using the Heterogeneous Programming Library
M Viñas, BB Fraguela, D Andrade… - Concurrency and …, 2017 - Wiley Online Library
Stencil computations are very common in scientific codes. Heterogeneous systems achieve
good results solving these problems, but their programming is complex because of the ghost …
Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library
The existence of a wide variety of computing devices with very different properties makes
essential the development of software that is not only portable among them, but which also …
An automatic optimizer for heterogeneous devices
Codes written in a naive way seldom effectively exploit the computing resources, while
writing optimized codes is usually a complex task that requires certain levels of expertise …
Compiler-Only Code Generation for Performant and Modular Matrix-Multiplication Micro Kernels Using Matrix Engines
B Kuzma - 2021 - era.library.ualberta.ca
General Matrix-Matrix Multiplication (GEMM) is used widely in many
high-performance application domains. In many cases, these applications repeatedly …
Accelerating the HyperLogLog cardinality estimation algorithm
C Bozkus, BB Fraguela - Scientific Programming, 2017 - Wiley Online Library
In recent years, vast amounts of data of different kinds, from pictures and videos from our
cameras to software logs from sensor networks and Internet routers operating day and night …