rMPI: increasing fault resiliency in a message-passing environment.
… As an alternative, we present a method to increase resilience … This approach effectively
increases the time between faults … of some faults, it is now possible to increase the checkpoint …
Increasing fault resiliency in a message-passing environment.
… We use signals to alert rMPI of failed nodes and can thus simulate the failure of nodes for
testing purposes. However, since all nodes still are part of a complete MPI application and due …
Toward a scalable, transactional, fault-tolerant message passing interface for petascale and exascale machines
A Hassani - 2016 - search.proquest.com
… and scalable fault-tolerant environment requires the … on resilience in large-scale computing
with the focus of fault-… rMPI [9] is a fault-tolerant implementation of MPI through process …
Legio: fault resiliency for embarrassingly parallel MPI applications
… File and remote memory access operations are not supported by ULFM and are likely to
fail in a faulty environment (rather than raising an error, they throw a segmentation …
[PDF] Redundant execution of HPC applications with MR-MPI
C Engelmann, S Böhm - Proceedings of the 10th IASTED …, 2011 - christian-engelmann.info
… problems, rMPI does not support certain complex MPI communicator operations, such as
MPI_Comm_split(). rMPI … Increasing fault resiliency in a message-passing environment. Technical …
Transparent redundant computing with MPI
… the overhead of providing resiliency through redundancy. … ideas that could improve the
performance of rMPI and limit … Increasing fault resiliency in a message-passing environment. …
Asynchronous Teams and Tasks in a Message Passing Environment
B Hazelwood - 2019 - etheses.dur.ac.uk
… must become tolerant to such faults. With large numbers of … contributes resiliency features
to the “Exascale Environment” … were also released following rMPI in 2011. After reiterating …
[HTML] Lazy Fault Recovery for Redundant MPI
E Saliba - 2019 - search.proquest.com
… languages for DS is Message Passing Interface (MPI). … rely on any specific MPI library
whereas rMPI is specific to MPICH. … , then explain the need for resilience in this library, and finally …
Fault-aware group-collective communication creation and repair in MPI
… We integrate our solution into an existing fault-resiliency … of fault management techniques in
the Message Passing Interface (… to fault presence consists of letting the MPI call raise an error…
A robust and efficient message passing library for volunteer computing environments
… MPI/FT [9], P2P-MPI [19] and rMPI [16] are MPI libraries that provide fault-tolerance through
… failure or major performance degradations. Furthermore, we demonstrated that increasing …