ConEx: Efficient exploration of big-data system configurations for better performance
R Krishna, C Tang, K Sullivan… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Configuration space complexity makes the big-data software systems hard to configure well.
Consider Hadoop, with over nine hundred parameters, developers often just use the default …
A survey of machine learning techniques for self-tuning Hadoop performance
The Apache Hadoop framework is an open source implementation of MapReduce for
processing and storing big data. However, to get the best performance from this is a big …
Improving the performance of Hadoop MapReduce Applications via Optimization of concurrent containers per Node
TT Htay, S Phyu - 2020 IEEE Conference on Computer …, 2020 - ieeexplore.ieee.org
Apache Hadoop is a distributed platform for storing, processing and analyzing of big data on
commodity machines. Hadoop has tunable parameters and they affect the performance of …
Noninvasive MapReduce performance tuning using multiple tuning methods on Hadoop
There are more than 190 configuration parameters affecting the performance of MapReduce
jobs on Hadoop. It is time-consuming and tedious for general users who have no deep …
Hadoop on named data networking: Experience and results
The Named Data Networking (NDN) architecture retrieves content by names rather than
connecting to specific hosts. It provides benefits such as highly efficient and resilient content …
An open source project for tuning and analyzing mapreduce performance in Hadoop and Spark
D Chen, R Zhang - IEEE Software, 2020 - ieeexplore.ieee.org
MapReduce parameter tuning is time consuming, and existing tuning systems are difficult to
use. We present an open source project, Catla for Hadoop and Spark, to provide …
An open-source project for MapReduce performance self-tuning
D Chen - arXiv preprint arXiv:1912.12456, 2019 - arxiv.org
Many Hadoop configuration parameters have significant influence in the performance of
running MapReduce jobs on Hadoop. It is time-consuming and tedious for general users to …
FOCUS: LESSONS LEARNED IN DEVOPS FEATURE: PERFORMANCE TUNING
R Zhang - COLLABORATIVE ASPECTS OF OPEN DATA IN SE, 2022 - computer.org
Methods Architecture Figure 1 illustrates the Catla-HS architecture, which facilitates the
efficient tuning of configuration parameters in a flexible and automated manner, solving the …
Towards Performance Optimization for Hadoop MapReduce Applications
TT Htay, S Phyu - 2020 17th International Conference on …, 2020 - ieeexplore.ieee.org
Apache Hadoop is a widely used open-source distributed platform towards big data
processing and provides YARN based distributed parallel processing framework on low cost …
Approaches for fast similarity search with MapReduce
TN Phan - 2016 - epub.jku.at
Similarity search is the principle operation not only in databases but also in disciplinary
majors such as information retrieval, machine learning, or data mining. In addition, it has …