A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture

S Usman, R Mehmood, I Katib, A Albeshri - Electronics, 2022 - mdpi.com
Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …

A comprehensive survey of load balancing strategies using Hadoop queue scheduling and virtual machine migration

NS Dey, T Gunasekhar - IEEE Access, 2019 - ieeexplore.ieee.org
The recent growth in the demand for scalable applications from the consumers of the
services has motivated the application development community to build and deploy the …

Classification framework of MapReduce scheduling algorithms

N Tiwari, S Sarkar, U Bellur, M Indrawan - ACM Computing Surveys …, 2015 - dl.acm.org
A MapReduce scheduling algorithm plays a critical role in managing large clusters of
hardware nodes and meeting multiple quality requirements by controlling the order and …

Client-side scheduling based on application characterization on Kubernetes

V Medel, C Tolón, U Arronategui… - Economics of Grids …, 2017 - Springer
In container management systems, such as Kubernetes, the scheduler has to place
containers in physical machines and it should be aware of the degradation in performance …

ThroughputScheduler: Learning to schedule on heterogeneous Hadoop clusters

S Gupta, C Fritz, B Price, R Hoover, J Dekleer… - … Computing (ICAC 13), 2013 - usenix.org
Hadoop is the de-facto standard for big data analytics applications. Presently available
schedulers for Hadoop clusters assign tasks to nodes without regard to the capability of the …

Dynamically adaptive, resource aware system and method for scheduling

S Gupta, C Fritz, J De Kleer - US Patent 9,672,064, 2017 - Google Patents
The following relates generally to computer system efficiency improvements. Broadly,
systems and methods are disclosed that improve efficiency in a cluster of nodes by efficient …

A comprehensive view of Hadoop MapReduce scheduling algorithms

SR Pakize - International Journal of Computer Networks & …, 2014 - researchgate.net
Hadoop is a Java-based programming framework that supports the storing and processing
of large data sets in a distributed computing environment and it is very much appropriate for …

Performance improvement of MapReduce for heterogeneous clusters based on efficient locality and replica aware scheduling (ELRAS) strategy

JV Bibal Benifa, Dejey - Wireless Personal Communications, 2017 - Springer
MapReduce is a parallel programming model for processing the data-intensive applications
in a cloud environment. The scheduler greatly influences the performance of MapReduce …

Performance evaluation of job schedulers on Hadoop YARN

JC Lin, MC Lee - Concurrency and Computation: Practice and …, 2016 - Wiley Online Library
To solve the limitation of Hadoop on scalability, resource sharing, and application support,
the open‐source community proposes the next generation of Hadoop's compute platform …