Finding the right cloud configuration for analytics clusters

M Bilal, M Canini, R Rodrigues - … of the 11th ACM Symposium on Cloud …, 2020 - dl.acm.org
Finding good cloud configurations for deploying a single distributed system is already a
challenging task, and it becomes substantially harder when a data analytics cluster is …

Monkeyking: Adaptive parameter tuning on big data platforms with deep reinforcement learning

H Du, P Han, Q Xiang, S Huang - Big Data, 2020 - liebertpub.com
Choosing the right parameter configurations for recurring jobs running on big data analytics
platforms is difficult because there can be hundreds of possible parameter configurations to …

Auto tuning of Hadoop and Spark parameters

T Patanshetti, AA Pawar, D Patel, S Thakare - arXiv preprint arXiv …, 2021 - arxiv.org
Data of the order of terabytes, petabytes, or beyond is known as Big Data. This data cannot
be processed using the traditional database software, and hence there comes the need for …

Algorithmic Proficiency in Spark Configuration Tuning: An Empirical Study using Execution Time Metrics across Varied Workloads

P Sewal, H Singh - Procedia Computer Science, 2024 - Elsevier
In the realm of big data, where datasets of immense scale pose processing challenges,
distributed processing platforms like open-source Apache Spark have emerged to address …

Spark performance optimization analysis in memory management with deploy mode in standalone cluster computing

DM Adinew, Z Shijie, Y Liao - 2020 IEEE 36th International …, 2020 - ieeexplore.ieee.org
As data is growing in different dimensions, it is difficult to get appropriate data analytic tools.
Spark is one of high speed "in-memory computing" big data analytic tool designed to …

Systems and methods of resource configuration optimization for machine learning workloads

L Cao, F Ahmed, P Sharma - US Patent 11,797,340, 2023 - Google Patents
Systems and methods are provided for optimally allocating resources used to
perform multiple tasks/jobs, e.g., machine learning training jobs. The possible resource …

Nostop: A novel configuration optimization scheme for Spark Streaming

Q Ye, W Liu, CQ Wu - Proceedings of the 50th International Conference …, 2021 - dl.acm.org
An increasing number of big data applications in various domains generate datasets
continuously, which must be processed for various purposes in a timely manner. As one of …

Spark performance optimization analysis in memory tuning on GC overhead for big data analytics

DM Adinew, Z Shijie, Y Liao - Proceedings of the 2019 8th International …, 2019 - dl.acm.org
Apache Spark is one of the high speed "in-memory computing" that run over the JVM. Due to
increasing data in volume, it needs performance optimization mechanism that requires …

Automatic configuration for cloud workloads.

M Bilal - 2022 - dial.uclouvain.be
With the increase in the adoption of Cloud computing and big data processing systems, a
common way to deploy analytics workloads is to acquire on-demand resources in a Cloud …

Constraint-aware performance autotuning in live production environment

S Cereda - 2022 - politesi.polimi.it
Modern IT systems offer hundreds of tunable knobs that impact their performances. Manually
finding a well-performing configuration is a daunting task, especially when considering that …