Otterman: A novel approach of Spark auto-tuning by a hybrid strategy

H Du, P Han, W Chen, Y Wang… - 2018 5th International …, 2018 - ieeexplore.ieee.org
Spark has become a very attractive platform for big data analytics in recent years due to its
unique advantages such as parallelism, fault tolerance, and complexity associated with …

Spark parameter tuning via trial and error

P Petridis, A Gounaris, J Torres - INNS Conference on Big Data, 2016 - Springer
Spark has been established as an attractive platform for big data analysis, since it manages
to hide most of the complexities related to parallelism, fault tolerance and cluster setting from …

Tuning performance of Spark programs

H Zhang, Z Liu, L Wang - 2018 IEEE International Conference …, 2018 - ieeexplore.ieee.org
Along with the explosive growth of data, there is a great demand to speed up the ability to
process them. Although there are several platforms such as Spark that have made analysis …

Pets: Bottleneck-aware Spark tuning with parameter ensembles

TBG Perez, W Chen, R Ji, L Liu… - 2018 27th International …, 2018 - ieeexplore.ieee.org
Spark tuning with its dozens of parameters for performance improvement is both a challenge
and a time-consuming effort. Current techniques rely on trial-and-error or best guess utilizing …

SMBSP: a self-tuning approach using machine learning to improve performance of Spark in big data processing

MA Rahman, J Hossen… - 2018 7th International …, 2018 - ieeexplore.ieee.org
Apache Spark, popularly known for big data processing capability, is a distributed
open-source platform that uses the concept of distributed memory to facilitate big data processing …

Towards general and efficient online tuning for Spark

Y Li, H Jiang, Y Shen, Y Fang, X Yang, D Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
The distributed data analytic system Spark is a common choice for processing massive
volumes of heterogeneous data, while it is challenging to tune its parameters to achieve …

A methodology for Spark parameter tuning

A Gounaris, J Torres - Big data research, 2018 - Elsevier
Spark has been established as an attractive platform for big data analysis, since it manages
to hide most of the complexities related to parallelism, fault tolerance and cluster setting from …

A novel method for tuning configuration parameters of Spark based on machine learning

G Wang, J Xu, B He - … Conference on Smart City; IEEE 2nd …, 2016 - ieeexplore.ieee.org
Apache Spark is an open-source distributed data processing platform, which can use
distributed memory abstraction to process large volumes of data efficiently. With the …

Sparker: Optimizing Spark for heterogeneous clusters

N Garg, D Janakiram - 2018 IEEE International Conference on …, 2018 - ieeexplore.ieee.org
Spark is an in-memory big data analytics framework which has replaced Hadoop as the de
facto standard for processing big data in cloud platforms. These frameworks run on cloud …

Auto-tuning Spark configurations based on neural network

J Gu, Y Li, H Tang, Z Wu - 2018 IEEE International Conference …, 2018 - ieeexplore.ieee.org
For massive data processing platforms such as Spark, configuration tuning is a necessary
step since it is closely related to task parallelism, resource allocation and fault tolerance …