Towards dense linear algebra for hybrid GPU accelerated manycore systems S Tomov, J Dongarra, M Baboulin Parallel Computing 36 (5-6), 232-240, 2010 | 594 | 2010 |
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects E Agullo, J Demmel, J Dongarra, B Hadri, J Kurzak, J Langou, H Ltaief, ... Journal of Physics: Conference Series 180 (1), 012037, 2009 | 581 | 2009 |
From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming P Du, R Weber, P Luszczek, S Tomov, G Peterson, J Dongarra Parallel Computing 38 (8), 391-407, 2012 | 475 | 2012 |
GPU Computing Gems Jade Edition (Applications of GPU Computing Series) WW Hwu, editor Morgan Kaufmann Publishers Inc., 2011 | 392* | 2011 |
Dense linear algebra solvers for multicore with GPU accelerators S Tomov, R Nath, H Ltaief, J Dongarra 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 347 | 2010 |
An improved MAGMA GEMM for Fermi graphics processing units R Nath, S Tomov, J Dongarra The International Journal of High Performance Computing Applications 24 (4 …, 2010 | 294 | 2010 |
Accelerating scientific computations with mixed precision algorithms M Baboulin, A Buttari, J Dongarra, J Kurzak, J Langou, J Langou, ... Computer Physics Communications 180 (12), 2526-2533, 2009 | 272 | 2009 |
Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers A Haidar, S Tomov, J Dongarra, NJ Higham SC18: International Conference for High Performance Computing, Networking …, 2018 | 244 | 2018 |
A note on auto-tuning GEMM for GPUs Y Li, J Dongarra, S Tomov Computational Science–ICCS 2009: 9th International Conference Baton Rouge …, 2009 | 236 | 2009 |
The impact of multicore on math software A Buttari, J Dongarra, J Kurzak, J Langou, P Luszczek, S Tomov International Workshop on Applied Parallel Computing, 1-10, 2006 | 178 | 2006 |
A hybridization methodology for high-performance linear algebra software for GPUs E Agullo, C Augonnet, J Dongarra, H Ltaief, R Namyst, S Thibault, ... GPU Computing Gems Jade Edition, 473-484, 2012 | 162 | 2012 |
Autotuning GEMM kernels for the Fermi GPU J Kurzak, S Tomov, J Dongarra IEEE Transactions on Parallel and Distributed Systems 23 (11), 2045-2057, 2012 | 153 | 2012 |
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy A Buttari, J Dongarra, J Kurzak, P Luszczek, S Tomov ACM Transactions on Mathematical Software (TOMS) 34 (4), 1-22, 2008 | 151 | 2008 |
QR factorization on a multicore node enhanced with multiple GPU accelerators E Agullo, C Augonnet, J Dongarra, M Faverge, H Ltaief, S Thibault, ... 2011 IEEE International Parallel & Distributed Processing Symposium, 932-943, 2011 | 145 | 2011 |
Performance, design, and autotuning of batched GEMM for GPUs A Abdelfattah, A Haidar, S Tomov, J Dongarra High Performance Computing: 31st International Conference, ISC High …, 2016 | 133 | 2016 |
Accelerating numerical dense linear algebra calculations with GPUs J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ... Numerical computations with GPUs, 3-28, 2014 | 133 | 2014 |
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, A Fox, ... The International Journal of High Performance Computing Applications 35 (4 …, 2021 | 119 | 2021 |
Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems F Song, S Tomov, J Dongarra Proceedings of the 26th ACM International Conference on Supercomputing, 365-376, 2012 | 116 | 2012 |
Parallel performance measurement of heterogeneous parallel systems with GPUs AD Malony, S Biersdorff, S Shende, H Jagode, S Tomov, G Juckeland, ... 2011 International Conference on Parallel Processing, 176-185, 2011 | 116 | 2011 |
Power aware computing on GPUs K Kasichayanula, D Terpstra, P Luszczek, S Tomov, S Moore, ... 2012 Symposium on Application Accelerators in High Performance Computing, 64-73, 2012 | 113 | 2012 |