Evaluating the viability of process replication reliability for exascale systems K Ferreira, J Stearley, JH Laros III, R Oldfield, K Pedretti, R Brightwell, ... Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 346 | 2011 |
MRNet: A software-based multicast/reduction network for scalable tools PC Roth, DC Arnold, BP Miller 2003 ACM/IEEE Conference on Supercomputing (SC’03), 21, 2003 | 313* | 2003 |
Stack trace analysis for large scale debugging DC Arnold, DH Ahn, BR De Supinski, GL Lee, BP Miller, M Schulz 2007 IEEE International Parallel and Distributed Processing Symposium, 1-10, 2007 | 232* | 2007 |
Users’ Guide to NetSolve V2.0 D Arnold, S Agrawal, S Blackford, J Dongarra, C Fabianek, T Hiroyasu, ... Univ. of Tennessee, 2004 | 135* | 2004 |
On the viability of compression for reducing the overheads of checkpoint/restart-based fault tolerance D Ibtesham, D Arnold, PG Bridges, KB Ferreira, R Brightwell 2012 41st international conference on parallel processing, 148-157, 2012 | 103* | 2012 |
Request sequencing: Optimizing communication for the Grid DC Arnold, D Bachmann, J Dongarra European Conference on Parallel Processing, 1213-1222, 2000 | 74 | 2000 |
Innovations of the NetSolve grid computing system DC Arnold, H Casanova, J Dongarra Concurrency and computation: practice and experience 14 (13‐15), 1457-1479, 2002 | 70 | 2002 |
libhashckpt: hash-based incremental checkpointing using GPU’s KB Ferreira, R Riesen, R Brightwell, P Bridges, D Arnold European MPI Users' Group Meeting, 272-281, 2011 | 68 | 2011 |
Lessons learned at 208k: towards debugging millions of cores GL Lee, DH Ahn, DC Arnold, BR De Supinski, M Legendre, BP Miller, ... SC'08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, 1-9, 2008 | 65 | 2008 |
Alleviating scalability issues of checkpointing protocols R Riesen, K Ferreira, D Da Silva, P Lemarinier, D Arnold, PG Bridges SC'12: Proceedings of the International Conference on High Performance …, 2012 | 52 | 2012 |
A framework for scalable, parallel performance monitoring A Nataraj, AD Malony, A Morris, DC Arnold, BP Miller Concurrency and Computation: Practice and Experience 22 (6), 720-735, 2010 | 42* | 2010 |
The NetSolve environment: Progressing towards the seamless grid DC Arnold, J Dongarra Proceedings 2000. International Workshop on Parallel Processing, 199-206, 2000 | 41 | 2000 |
Using Simulation to Evaluate the Performance of Resilience Strategies at Scale S Levy, B Topp, D Arnold, KB Ferreira, P Widener, T Hoefler 4th International Workshop on Performance Modeling, Benchmarking and …, 2013 | 40 | 2013 |
Using simulation to explore distributed key-value stores for extreme-scale system services K Wang, A Kulkarni, M Lang, D Arnold, I Raicu Proceedings of the International Conference on High Performance Computing …, 2013 | 36 | 2013 |
Understanding the effects of communication and coordination on checkpointing at scale KB Ferreira, P Widener, S Levy, D Arnold, T Hoefler SC'14: Proceedings of the International Conference for High Performance …, 2014 | 35 | 2014 |
Does partial replication pay off? J Stearley, K Ferreira, D Robinson, J Laros, K Pedretti, D Arnold, ... IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012 | 35 | 2012 |
Tree-based overlay networks for scalable applications DC Arnold, GD Pack, BP Miller 20th International Parallel and Distributed Processing Symposium, 2006 …, 2006 | 33* | 2006 |
Middleware for the use of storage in communication M Beck, D Arnold, A Bassi, F Berman, H Casanova, J Dongarra, T Moore, ... Parallel Computing 28 (12), 1773-1787, 2002 | 33* | 2002 |
Improving MPI multi-threaded RMA communication performance N Hjelm, MGF Dosanjh, RE Grant, T Groves, P Bridges, D Arnold Proceedings of the 47th International Conference on Parallel Processing, 1-11, 2018 | 32 | 2018 |
On the convergence of computational and data grids DC Arnold, SS Vahdiyar, JJ Dongarra Parallel Processing Letters 11 (02n03), 187-202, 2001 | 32 | 2001 |