A comprehensive survey of load balancing strategies using Hadoop queue scheduling and virtual machine migration

NS Dey, T Gunasekhar - IEEE Access, 2019 - ieeexplore.ieee.org
The recent growth in the demand for scalable applications from the consumers of the
services has motivated the application development community to build and deploy the …

Hopper: Decentralized speculation-aware cluster scheduling at scale

X Ren, G Ananthanarayanan, A Wierman… - Proceedings of the 2015 …, 2015 - dl.acm.org
As clusters continue to grow in size and complexity, providing scalable and predictable
performance is an increasingly important challenge. A crucial roadblock to achieving …

Energy-aware scheduling of MapReduce jobs for big data applications

L Mashayekhy, MM Nejad, D Grosu… - IEEE transactions on …, 2014 - ieeexplore.ieee.org
The majority of large-scale data intensive applications executed by data centers are based
on MapReduce or its open-source implementation, Hadoop. Such applications are executed …

Encoded bitmap indexing for data warehouses

MC Wu, AP Buchmann - Proceedings 14th International …, 1998 - ieeexplore.ieee.org
Complex query types, huge data volumes, and very high read/update ratios make the
indexing techniques designed and tuned for traditional database systems unsuitable for …

Wide-area analytics with multiple resources

CC Hung, G Ananthanarayanan, L Golubchik… - Proceedings of the …, 2018 - dl.acm.org
Running data-parallel jobs across geo-distributed sites has emerged as a promising
direction due to the growing need for geo-distributed cluster deployment. A key difference …

Fuzzy joins using MapReduce

FN Afrati, AD Sarma, D Menestrina… - 2012 IEEE 28th …, 2012 - ieeexplore.ieee.org
Fuzzy/similarity joins have been widely studied in the research community and extensively
used in real-world applications. This paper proposes and evaluates several algorithms for …

DynamicMR: A dynamic slot allocation optimization framework for MapReduce clusters

S Tang, BS Lee, B He - IEEE Transactions on Cloud …, 2014 - ieeexplore.ieee.org
MapReduce is a popular computing paradigm for large-scale data processing in cloud
computing. However, the slot-based MapReduce system (e.g., Hadoop MRv1) can suffer from …

Joint optimization of overlapping phases in MapReduce

M Lin, L Zhang, A Wierman, J Tan - ACM SIGMETRICS Performance …, 2014 - dl.acm.org
MapReduce is a scalable parallel computing framework for big data processing. It exhibits
multiple processing phases, and thus an efficient job scheduling mechanism is crucial for …

Two sides of a coin: Optimizing the schedule of MapReduce jobs to minimize their makespan and improve cluster performance

A Verma, L Cherkasova… - 2012 IEEE 20th …, 2012 - ieeexplore.ieee.org
Large-scale MapReduce clusters that routinely process petabytes of unstructured and
semi-structured data represent a new entity in the changing landscape of clouds. A key challenge …

Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle

A Verma, L Cherkasova, VS Kumar… - 2012 IEEE Network …, 2012 - ieeexplore.ieee.org
Hadoop and the associated MapReduce paradigm have become the de facto platform for
cost-effective analytics over “Big Data”. There is an increasing number of MapReduce …