Satin: A high-level and efficient grid programming model

RV Van Nieuwpoort, G Wrzesińska… - ACM Transactions on …, 2010 - dl.acm.org
Computational grids have an enormous potential to provide compute power. However, this
power remains largely unexploited today for most applications, except trivially parallel …

Task-level resilience: checkpointing vs. supervision

J Posner, L Reitz, C Fohry - International Journal of Networking and …, 2022 - jstage.jst.go.jp
With the advent of exascale computing, issues such as application irregularity and
permanent hardware failure are growing in importance. Irregularity is often addressed by …

Checkpointing vs. supervision resilience approaches for dynamic independent tasks

J Posner, L Reitz, C Fohry - 2021 IEEE International Parallel …, 2021 - ieeexplore.ieee.org
With the advent of exascale computing, issues such as application irregularity and
permanent hardware failure are growing in importance. Irregularity is often addressed by …

Fault tolerance schemes for global load balancing in X10

C Fohry, M Bungart, J Posner - Scalable Computing: Practice and …, 2015 - scpe.org
Scalability postulates fault tolerance to be efficient. One approach handles permanent node
failures at user level. It is supported by Resilient X10, a Partitioned Global Address Space …

Load Balancing, Fault Tolerance, and Resource Elasticity for Asynchronous Many-Task Systems

J Posner - 2021 - kobra.uni-kassel.de
Abstract: High-Performance Computing (HPC) enables solving complex problems from
various scientific fields including key societal problems such as COVID-19. Recently …

Fehlertoleranz und Elastizität für ein Framework zur globalen Lastenbalancierung [Fault tolerance and elasticity for a global load balancing framework]

M Bungart - 2018 - kobra.uni-kassel.de
Abstract: The number of compute nodes in high-performance computers is growing steadily. In
such systems, the importance of fault tolerance increases, since the probability …

Study of enterprise load balancing algorithms using model-based design

KA Aidarov, AZ Almatov - Journal of Mathematics, Mechanics and …, 2017 - bm.kaznu.kz
This work describes load balancing algorithms for external services with unspecified
clients used in real enterprise facilities. The simplest example of such a service is a pair of a web …

FT-PAS - A framework for pattern specific fault-tolerance in parallel programming

G Jakadeesan - 2009 - spectrum.library.concordia.ca
Fault-tolerance is an important requirement for long running parallel applications. Many
approaches are discussed in various literatures about providing fault-tolerance for parallel …

[CITATION] Load Balancing, Fault Tolerance, and Resource Elasticity for Asynchronous Many-Task Systems

C Fohry, M Schulz

[CITATION] Design and evaluation of a work stealing-based fault tolerance scheme for task pools

L Reitz - Master's thesis, University of Kassel, 2019