A survey of data-intensive scientific workflow management

J Liu, E Pacitti, P Valduriez, M Mattoso - Journal of Grid Computing, 2015 - Springer
Nowadays, more and more computer-based scientific experiments need to handle massive
amounts of data. Their data processing consists of multiple computational steps and …

Toward data lakes as central building blocks for data management and analysis

P Wieder, H Nolte - Frontiers in Big Data, 2022 - frontiersin.org
Data lakes are a fundamental building block for many industrial data analysis solutions and
becoming increasingly popular in research. Often associated with big data use cases, data …

A provenance-based adaptive scheduling heuristic for parallel scientific workflows in clouds

D de Oliveira, KACS Ocaña, F Baião… - Journal of grid …, 2012 - Springer
In the last years, scientific workflows have emerged as a fundamental abstraction for
structuring and executing scientific experiments in computational environments. Scientific …

Comparing FutureGrid, Amazon EC2, and Open Science Grid for scientific workflows

G Juve, M Rynge, E Deelman… - … in Science & …, 2013 - ieeexplore.ieee.org
Scientists have many computing infrastructures available to conduct their research,
including grids and public or private clouds. This article explores the use of these cyber …

Performance evaluation of parallel strategies in public clouds: A study with phylogenomic workflows

D de Oliveira, KACS Ocaña, E Ogasawara… - Future Generation …, 2013 - Elsevier
Data analysis is an exploratory process that demands high performance computing (HPC).
SciPhylomics, for example, is a data-intensive workflow that aims at producing …

BeeFlow: A workflow management system for in situ processing across HPC and cloud systems

J Chen, Q Guan, Z Zhang, X Liang… - 2018 IEEE 38th …, 2018 - ieeexplore.ieee.org
In this paper, we propose BeeFlow, an in situ analysis-enabled workflow management
system across multiple platforms using Docker containers. BeeFlow can support both …

Cloud autoscaling simulation based on queueing network model

T Vondra, J Šedivý - Simulation Modelling Practice and Theory, 2017 - Elsevier
For the development of a predictive autoscaler for private clouds, an evaluation method was
needed. A survey of available tools was made, but none were found suitable. The …

Big data workflows: A reference architecture and the DATAVIEW system

A Kashlev, S Lu, A Mohan - Services Transactions on Big Data …, 2017 - amohan.mcm.edu
The big data era is here, a natural result of the digital revolution of the last few decades. The
emergence of big data in virtually all areas of life raises a fundamental question: how can we …

A method for trust quantification in cloud computing environments

X Li, J He, B Zhao, J Fang, Y Zhang… - International Journal of …, 2016 - journals.sagepub.com
Cloud computing and Internet of Things (IoT) are emerging technologies that have
experienced rapid development in recent years. While cloud computing presents a new …

A reinforcement learning scheduling strategy for parallel cloud-based workflows

A Nascimento, V Olimpio, V Silva… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Scientific experiments can be modeled as Workflows. Such Workflows are usually
computing- and data-intensive, demanding the use of High-Performance Computing …