ConEx: Efficient exploration of big-data system configurations for better performance

R Krishna, C Tang, K Sullivan… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Configuration space complexity makes big-data software systems hard to configure well.
Consider Hadoop: with over nine hundred parameters, developers often just use the default …

A survey of machine learning techniques for self-tuning Hadoop performance

MA Rahman, J Hossen… - … Journal of Electrical …, 2018 - researchgate.net
The Apache Hadoop framework is an open source implementation of MapReduce for
processing and storing big data. However, to get the best performance from this is a big …

Improving the performance of Hadoop MapReduce Applications via Optimization of concurrent containers per Node

TT Htay, S Phyu - 2020 IEEE Conference on Computer …, 2020 - ieeexplore.ieee.org
Apache Hadoop is a distributed platform for storing, processing, and analyzing big data on
commodity machines. Hadoop has tunable parameters and they affect the performance of …

Noninvasive MapReduce performance tuning using multiple tuning methods on Hadoop

D Chen, R Zhang, RG Qiu - IEEE Systems Journal, 2020 - ieeexplore.ieee.org
There are more than 190 configuration parameters affecting the performance of MapReduce
jobs on Hadoop. It is time-consuming and tedious for general users who have no deep …

Hadoop on named data networking: Experience and results

M Gibbens, C Gniady, L Ye, B Zhang - … of the ACM on Measurement and …, 2017 - dl.acm.org
The Named Data Networking (NDN) architecture retrieves content by names rather than
connecting to specific hosts. It provides benefits such as highly efficient and resilient content …

An open source project for tuning and analyzing mapreduce performance in Hadoop and Spark

D Chen, R Zhang - IEEE Software, 2020 - ieeexplore.ieee.org
MapReduce parameter tuning is time consuming, and existing tuning systems are difficult to
use. We present an open source project, Catla for Hadoop and Spark, to provide …

An open-source project for MapReduce performance self-tuning

D Chen - arXiv preprint arXiv:1912.12456, 2019 - arxiv.org
Many Hadoop configuration parameters have significant influence on the performance of
running MapReduce jobs on Hadoop. It is time-consuming and tedious for general users to …

FOCUS: LESSONS LEARNED IN DEVOPS FEATURE: PERFORMANCE TUNING

R Zhang - COLLABORATIVE ASPECTS OF OPEN DATA IN SE, 2022 - computer.org
Figure 1 illustrates the Catla-HS architecture, which facilitates the
efficient tuning of configuration parameters in a flexible and automated manner, solving the …

Towards Performance Optimization for Hadoop MapReduce Applications

TT Htay, S Phyu - 2020 17th International Conference on …, 2020 - ieeexplore.ieee.org
Apache Hadoop is a widely used open-source distributed platform for big data
processing and provides YARN based distributed parallel processing framework on low cost …

Approaches for fast similarity search with MapReduce

TN Phan - 2016 - epub.jku.at
Similarity search is the principal operation not only in databases but also in disciplinary
majors such as information retrieval, machine learning, or data mining. In addition, it has …