HSG-LM: hybrid-copy speculative guest OS live migration without hypervisor

P Lu, A Barbalace, B Ravindran - … of the 6th International Systems and …, 2013 - dl.acm.org
Current Virtual Machine (VM) live migration mechanisms only focus on providing a high
availability service by offering minimal downtime to users. In this paper, we present a novel …

Lightweight checkpoint mechanism and modeling in GPGPU environment

S Laosooksathit, N Naksinehaboon… - Computing (HPC …, 2010 - Citeseer
While High Performance Computing (HPC) systems continue to scale in volume of
computing elements and overall computing powers, the performance/cost benefit of these …

CDMCR: multi‐level fault‐tolerant system for distributed applications in cloud

W Qiang, C Jiang, L Ran, D Zou… - Security and …, 2016 - Wiley Online Library
Cloud provides users with a new model of utilizing the computing infrastructure with the
ability to perform parallel and distributed computations using elastic virtual cluster. However …

Global snapshot of a distributed system running on virtual machines

CE Gómez, HE Castro… - 2017 29th International …, 2017 - ieeexplore.ieee.org
Recently, a new concept called desktop cloud emerged, which was developed to offer cloud
computing services on non-dedicated resources. Similarly to cloud computing, desktop …

A transparent hypervisor-level checkpoint-restart mechanism for a cluster of virtual machines

C Pechwises, K Chanchio - 2018 15th International Joint …, 2018 - ieeexplore.ieee.org
A cluster of virtual machines is a common platform for running MPI applications in cloud
computing environments. However, most traditional methods to provide fault tolerance to …

A checkpointing mechanism for virtual clusters using memory-bound time-multiplexed data transfers

J Yaothanee, K Chanchio - International Journal of Electrical and …, 2024 - academia.edu
Transparent hypervisor-level checkpoint-restart mechanisms for virtual clusters (VCs) or
clusters of virtual machines (VMs) offer an attractive fault tolerance capability for cloud data …

A portable and adaptable fault tolerance solution for heterogeneous applications

N Losada, BB Fraguela, P González… - Journal of Parallel and …, 2017 - Elsevier
Heterogeneous systems have increased their popularity in recent years due to the high
performance and reduced energy consumption capabilities provided by using devices such …

Restricted simple disjunctive decompositions based on grouping symmetric variables

H Sawada, S Yamashita… - Proceedings Great Lakes …, 1997 - ieeexplore.ieee.org
This paper presents an efficient method for a simple disjunctive decomposition, where
candidates for the bound set are restricted to sets of symmetric variables to reduce the …

Taxonomy of Contention Management in Interconnected Distributed Systems.

MA Salehi, JH Abawajy, R Buyya - 2014 - academia.edu
Interconnected distributed computing systems, such as computing Grids and federated
Clouds, have been of special importance in both industry and academia. Resources …

Two-level checkpoint/restart modeling for GPGPU

S Laosooksathit, N Naksinehaboon… - 2011 9th IEEE/ACS …, 2011 - ieeexplore.ieee.org
Because the reliability and availability of a large-scale system are inversely related to the
number of computing elements, fault tolerance has become a major concern in high …