A survey of fault tolerance mechanisms and checkpoint/restart implementations for high performance computing systems
Abstract In recent years, High Performance Computing (HPC) systems have been shifting
from expensive massively parallel architectures to clusters of commodity PCs to take …
A roadmap toward the resilient internet of things for cyber-physical systems
The Internet of Things (IoT) is a ubiquitous system connecting many different devices (the
things), which can be accessed from a distance. The cyber-physical systems (CPSs) …
Methods, media and systems for responding to a denial of service attack
Methods, media and systems for responding to a Denial of Service (DoS) attack are
provided. In some embodiments, a method includes detecting a DoS attack, migrating one or …
Transparent Checkpoint-Restart of Multiple Processes on Commodity Operating Systems
O Laadan, J Nieh - USENIX Annual Technical Conference, 2007 - usenix.org
The ability to checkpoint a running application and restart it later can provide many useful
benefits including fault recovery, advanced resources sharing, dynamic load balancing and …
Systems, methods, means, and media for recording, searching, and outputting display information
A portion of the disclosure of this patent document contains material which is subject to
copyright protection. The copyright owner has no objection to the facsimile reproduction by …
Fault tolerance to balance for messaging layers in communication society
A Mikhail, HH Kareem… - … International conference on …, 2017 - ieeexplore.ieee.org
The present communication societies are based on use of High-Performance Computing
(HPC) systems for balancing the messaging layers. However the HPC systems are …
Methods, media and systems for managing a distributed application running in a plurality of digital processing devices
Methods, media and systems for managing a distributed application running in a plurality of
digital processing devices are provided. In some embodiments, a method includes running …
Linux-CR: Transparent application checkpoint-restart in Linux
O Laadan, SE Hallyn - Linux Symposium, 2010 - Citeseer
Application checkpoint-restart is the ability to save the state of a running application so that it
can later resume its execution from the time of the checkpoint. Application checkpoint-restart …
Flux: Multi-surface computing in Android
With the continued proliferation of mobile devices, apps will increasingly become multi-
surface, running seamlessly across multiple user devices (e.g., phone, tablet, etc.). Yet …
Lightweight memory checkpointing
Memory checkpointing is a pivotal technique in systems reliability, with applications ranging
from crash recovery to replay debugging. Unfortunately, many traditional memory check …