An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling

P Ghysels, XS Li, FH Rouet, S Williams… - SIAM Journal on Scientific …, 2016 - SIAM
We present a sparse linear system solver that is based on a multifrontal variant of Gaussian
elimination and exploits low-rank approximation of the resulting dense frontal matrices. We …

Achieving high performance on supercomputers with a sequential task-based programming model

E Agullo, O Aumage, M Faverge… - … on Parallel and …, 2017 - ieeexplore.ieee.org
The emergence of accelerators as standard computing resources on supercomputers and
the subsequent architectural complexity increase revived the need for high-level parallel …

Prediction models for network-linked data

T Li, E Levina, J Zhu - 2019 - projecteuclid.org
Supplement to “Prediction models for network-linked data”. We provide the proof of
theoretical properties, computational complexity, additional simulation examples under …

Extreme-scale task-based Cholesky factorization toward climate and weather prediction applications

Q Cao, Y Pei, K Akbudak, A Mikhalev… - Proceedings of the …, 2020 - dl.acm.org
Climate and weather can be predicted statistically via geospatial Maximum Likelihood
Estimates (MLE), as an alternative to running large ensembles of forward models. The MLE …

Implementing multifrontal sparse solvers for multicore architectures with sequential task flow runtime systems

E Agullo, A Buttari, A Guermouche… - ACM Transactions on …, 2016 - dl.acm.org
To face the advent of multicore processors and the ever increasing complexity of hardware
architectures, programming models based on DAG parallelism regained popularity in the …

Task‐based FMM for heterogeneous architectures

E Agullo, B Bramas, O Coulaud, E Darve… - Concurrency and …, 2016 - Wiley Online Library
High performance fast multipole method is crucial for the numerical simulation of many
physical problems. In a previous study, we have shown that task‐based fast multipole …

A supernodal all-pairs shortest path algorithm

P Sao, R Kannan, P Gera, R Vuduc - Proceedings of the 25th ACM …, 2020 - dl.acm.org
We show how to exploit graph sparsity in the Floyd-Warshall algorithm for the all-pairs
shortest path (APSP) problem. Floyd-Warshall is an attractive choice for APSP on high …

Performance analysis of tile low-rank Cholesky factorization using PaRSEC instrumentation tools

Q Cao, Y Pei, T Herault, K Akbudak… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
This paper highlights the necessary development of new instrumentation tools within the
PaRSEC task-based runtime system to leverage the performance of low-rank matrix …

A visual performance analysis framework for task‐based parallel applications running on hybrid clusters

V Garcia Pinto, L Mello Schnorr… - Concurrency and …, 2018 - Wiley Online Library
Programming paradigms in High‐Performance Computing have been shifting
toward task‐based models that are capable of adapting readily to heterogeneous and …

A framework to exploit data sparsity in tile low-rank Cholesky factorization

Q Cao, R Alomairy, Y Pei, G Bosilca… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
We present a general framework that couples the PaRSEC runtime system and the HiCMA
numerical library to solve challenging 3D data-sparse problems. Though formally dense …