Rolex: Resilience-oriented language extensions for extreme-scale systems
S Hukerikar, RF Lucas - The Journal of Supercomputing, 2016 - Springer
Future exascale high-performance computing (HPC) systems will be constructed from VLSI
devices that will be less reliable than those used today, and faults will become the norm, not …
A self-correcting connected components algorithm
We present a new fault-tolerant algorithm for the problem of computing the connected
components of a graph. Our algorithm derives from a highly parallel but non-resilient …
Introspective resilience for exascale high-performance computing systems
S Hukerikar - 2015 - search.proquest.com
Future exascale high-performance computing (HPC) systems will be constructed using VLSI
devices with smaller feature sizes that will be far less reliable than those used today …