Methodological approach to data-centric cloudification of scientific iterative workflows
S Caíno-Lores, A Lapin, P Kropf, J Carretero - International Conference on …, 2016 - Springer
International Conference on Algorithms and Architectures for Parallel Processing, 2016 • Springer
Abstract
The computational complexity of scientific computing models and their constantly increasing amount of input data are threatening their scalability. In addition, this is leading towards more data-intensive scientific computing, thus raising the need to combine techniques and infrastructures from the HPC and big data worlds. This paper presents a methodological approach to cloudify generalist iterative scientific workflows, with a focus on improving data locality and preserving performance. To evaluate this methodology, it was applied to a hydrological simulator, EnKF-HGS. The design was implemented using Apache Spark, and assessed on a local cluster and on Amazon Elastic Compute Cloud (EC2) against the original version to evaluate performance and scalability.