An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal variant of Gaussian
elimination and exploits low-rank approximation of the resulting dense frontal matrices. We …
Achieving high performance on supercomputers with a sequential task-based programming model
The emergence of accelerators as standard computing resources on supercomputers and
the subsequent architectural complexity increase revived the need for high-level parallel …
Extreme-scale task-based Cholesky factorization toward climate and weather prediction applications
Climate and weather can be predicted statistically via geospatial Maximum Likelihood
Estimates (MLE), as an alternative to running large ensembles of forward models. The MLE …
Implementing multifrontal sparse solvers for multicore architectures with sequential task flow runtime systems
To face the advent of multicore processors and the ever increasing complexity of hardware
architectures, programming models based on DAG parallelism regained popularity in the …
Task‐based FMM for heterogeneous architectures
High performance fast multipole method is crucial for the numerical simulation of many
physical problems. In a previous study, we have shown that task‐based fast multipole …
A supernodal all-pairs shortest path algorithm
We show how to exploit graph sparsity in the Floyd-Warshall algorithm for the all-pairs
shortest path (APSP) problem. Floyd-Warshall is an attractive choice for APSP on high …
Performance analysis of tile low-rank Cholesky factorization using PaRSEC instrumentation tools
This paper highlights the necessary development of new instrumentation tools within the
PaRSEC task-based runtime system to leverage the performance of low-rank matrix …
A visual performance analysis framework for task‐based parallel applications running on hybrid clusters
V Garcia Pinto, L Mello Schnorr… - Concurrency and …, 2018 - Wiley Online Library
Summary: Programming paradigms in High‐Performance Computing have been shifting
toward task‐based models that are capable of adapting readily to heterogeneous and …
A framework to exploit data sparsity in tile low-rank Cholesky factorization
We present a general framework that couples the PaRSEC runtime system and the HiCMA
numerical library to solve challenging 3D data-sparse problems. Though formally dense …