Statistical structure analysis in MRI brain tumor segmentation
X Xuan, Q Liao - … Conference on Image and Graphics (ICIG …, 2007 - ieeexplore.ieee.org
Automated MRI (Magnetic Resonance Imaging) brain tumor segmentation is a difficult task
due to the variance and complexity of tumors. In this paper, a statistical structure analysis …
LaRIS: targeting portability and productivity for LAPACK codes on extreme heterogeneous systems by using IRIS
In keeping with the trend of heterogeneity in high-performance computing, hardware
manufacturers and vendors are developing new architectures and associated software …
Adaptive parallel applications: from shared memory architectures to fog computing (2002–2022)
G Galante, R da Rosa Righi - Cluster Computing, 2022 - Springer
The evolution of parallel architectures points to dynamic environments where the number of
available resources or configurations may vary during the execution of applications. This …
sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library)
In this work we have implemented a novel Linear Algebra Library on top of the task-based
runtime OmpSs-2. We have used some of the most advanced OmpSs-2 features; weak …
Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems
BLAS Level 3 operations are essential for scientific computing, but finding the optimal
number of threads for multi-threaded implementations on modern multi-core systems is …
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors
R Rodríguez-Sánchez, A Castelló… - … Journal of High …, 2024 - journals.sagepub.com
Malleability is defined as the ability to vary the degree of parallelism at runtime, and is
regarded as a means to improve core occupation on state-of-the-art multicore processors …
A Machine Learning Approach Towards Runtime Optimisation of Matrix Multiplication
The GEneral Matrix Multiplication (GEMM) is one of the essential algorithms in scientific
computing. Single-thread GEMM implementations are well-optimised with techniques like …
Towards a malleable TensorFlow implementation
The TensorFlow framework was designed since its inception to provide multi-thread
capabilities, extended with hardware accelerator support to leverage the potential of modern …
Static versus dynamic task scheduling of the LU factorization on ARM big.LITTLE architectures
S Catalán, R Rodríguez-Sánchez… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
We investigate several parallel algorithmic variants of the LU factorization with partial
pivoting (LUpp) that trade off the exploitation of increasing levels of task-parallelism in …
Programming parallel dense matrix factorizations with look-ahead and OpenMP
We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms,
using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts …