Resource management for Infrastructure as a Service (IaaS) in cloud computing: A survey

SS Manvi, GK Shyam - Journal of network and computer applications, 2014 - Elsevier
The cloud phenomenon is quickly becoming an important service in Internet computing.
Infrastructure as a Service (IaaS) in cloud computing is one of the most significant and …

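[PDF] Big data analysis: competition and symbiosis of RDBMS and MapReduce

覃雄派, 王会举, 杜小勇, 王珊 - Journal of Software (软件学报), 2012 - jos.org.cn
In many application areas, such as scientific research, computer simulation, Internet applications, and e-commerce, data volumes are growing extremely fast.
To analyze and exploit these massive data resources, effective data analysis techniques are required. Traditional relational data management techniques …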

A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

[PDF] Improving MapReduce performance in heterogeneous environments

M Zaharia, A Konwinski, AD Joseph, RH Katz, I Stoica - OSDI, 2008 - usenix.org
MapReduce is emerging as an important programming model for large-scale data-parallel
applications such as web indexing, data mining, and scientific simulation. Hadoop is an …

More for your money: exploiting performance heterogeneity in public clouds

B Farley, A Juels, V Varadarajan, T Ristenpart… - Proceedings of the …, 2012 - dl.acm.org
Infrastructure-as-a-service compute clouds such as Amazon's EC2 allow users to pay a flat
hourly rate to run their virtual machine (VM) on a server providing some combination of CPU …

Maestro: Replica-aware map scheduling for MapReduce

S Ibrahim, H Jin, L Lu, B He… - 2012 12th IEEE/ACM …, 2012 - ieeexplore.ieee.org
MapReduce has emerged as a leading programming model for data-intensive computing.
Many recent research efforts have focused on improving the performance of the distributed …

SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters

R Gu, X Yang, J Yan, Y Sun, B Wang, C Yuan… - Journal of parallel and …, 2014 - Elsevier
As a widely-used parallel computing framework for big data processing today, the Hadoop
MapReduce framework puts more emphasis on high-throughput of data than on low-latency …

Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture

S Usman, R Mehmood, I Katib, A Albeshri - Electronics, 2022 - mdpi.com
Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …

Budget-driven scheduling algorithms for batches of MapReduce jobs in heterogeneous clouds

Y Wang, W Shi - IEEE Transactions on Cloud Computing, 2014 - ieeexplore.ieee.org
In this paper, we consider task-level scheduling algorithms with respect to budget and
deadline constraints for a batch of MapReduce jobs on a set of provisioned heterogeneous …

Flutter: Scheduling tasks closer to data across geo-distributed datacenters

Z Hu, B Li, J Luo - IEEE INFOCOM 2016-The 35th Annual IEEE …, 2016 - ieeexplore.ieee.org
Typically called big data processing, processing large volumes of data from geographically
distributed regions with machine learning algorithms has emerged as an important …