Performance optimization of Spark MLlib workloads using cost efficient RICG model on exponential projective sampling

P Sewal, H Singh - Cluster Computing, 2024 - Springer
The performance optimization of Apache Spark, a widely used distributed computing
framework, is crucial for the efficient execution of data-intensive workloads. However, the …

Semantic Feature-Driven Automatic Parameter Optimization of Apache Spark

Q Zou, X Rong, J Yu - 2024 4th International Conference on …, 2024 - ieeexplore.ieee.org
Big data analysis frameworks such as Spark are widely used in various scenarios, and the
configuration of Spark significantly affects the execution time of the application. How to …