A survey of field-based testing techniques
Field testing refers to testing techniques that operate in the field to reveal those faults that
escape in-house testing. Field testing techniques are becoming increasingly popular with …
Addressing failures in exascale computing
We present here a report produced by a workshop on 'Addressing failures in exascale
computing' held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to …
Scalable temporal order analysis for large scale debugging
We present a scalable temporal order analysis technique that supports debugging of large
scale applications by classifying MPI tasks based on their logical program execution order …
Debugging high-performance computing applications at massive scales
COMMUNICATIONS OF THE ACM | SEPTEMBER 2015 | VOL. 58 | NO. 9 DOI:10.1145/2667219 …
Large scale debugging of parallel tasks with AutomaDeD
Developing correct HPC applications continues to be a challenge as the number of cores
increases in today's largest systems. Most existing debugging techniques perform poorly at …
Diagnosing performance bottlenecks in emerging petascale applications
Cutting-edge science and engineering applications require petascale computing. It is,
however, a significant challenge to use petascale computing platforms effectively …
Vrisha: using scaling properties of parallel programs for bug detection and localization
Detecting and isolating bugs that arise in parallel programs is a tedious and a challenging
task. An especially subtle class of bugs are those that are scale-dependent: while small …
Dyninst and MRNet: Foundational infrastructure for parallel tools
Parallel tools require common pieces of infrastructure: the ability to control, monitor, and
instrument programs, and the ability to massively scale these operations as the application …
WuKong: automatically detecting and localizing bugs that manifest at large system scales
A key challenge in developing large scale applications is finding bugs that are latent at the
small scales of testing, but manifest themselves when the application is deployed at a large …
Scalable performance analysis of exascale MPI programs through signature-based clustering algorithms
Extreme-scale computing poses a number of challenges to application performance.
Developers need to study application behavior by collecting detailed information with the …