Spark parameter tuning via trial-and-error

P Petridis, A Gounaris, J Torres - INNS Conference on Big Data, 2016 - Springer
Spark has been established as an attractive platform for big data analysis, since it manages
to hide most of the complexities related to parallelism, fault tolerance and cluster setting from …

A methodology for spark parameter tuning

A Gounaris, J Torres - Big data research, 2018 - Elsevier
Spark has been established as an attractive platform for big data analysis, since it manages
to hide most of the complexities related to parallelism, fault tolerance and cluster setting from …

Spark-diy: A framework for interoperable spark operations with high performance block-based data models

S Caíno-Lores, J Carretero, B Nicolae… - 2018 IEEE/ACM 5th …, 2018 - ieeexplore.ieee.org
Today's scientific applications are increasingly relying on a variety of data sources, storage
facilities, and computing infrastructures, and there is a growing demand for data analysis …

Sparkbench–a spark performance testing suite

D Agrawal, A Butt, K Doshi, JL Larriba-Pey, M Li… - … to Big Data to Internet of …, 2016 - Springer
Spark has emerged as an easy to use, scalable, robust and fast system for analytics with a
rapidly growing and vibrant community of users and contributors. It is multipurpose—with …

[BOOK][B] Big data analytics with Spark: A practitioner's guide to using Spark for large scale data analysis

M Guller - 2015 - Springer
This book is a concise and easy-to-understand tutorial for big data and Spark. It will help you
learn how to use Spark for a variety of big data analytic tasks. It covers everything that you …

Sparkbench: a comprehensive benchmarking suite for in memory data analytic platform spark

M Li, J Tan, Y Wang, L Zhang, V Salapura - Proceedings of the 12th ACM …, 2015 - dl.acm.org
Spark has been increasingly adopted by industries in recent years for big data analysis by
providing a fault tolerant, scalable and easy-to-use in memory abstraction. Moreover, the …

Towards general and efficient online tuning for spark

Y Li, H Jiang, Y Shen, Y Fang, X Yang, D Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
The distributed data analytic system--Spark is a common choice for processing massive
volumes of heterogeneous data, while it is challenging to tune its parameters to achieve …

How data volume affects spark based data analytics on a scale-up server

AJ Awan, M Brorsson, V Vlassov, E Ayguade - Big Data Benchmarks …, 2016 - Springer
Sheer increase in volume of data over the last decade has triggered research in cluster
computing frameworks that enable web enterprises to extract big insights from big data …

SparkBench: a spark benchmarking suite characterizing large-scale in-memory data analytics

M Li, J Tan, Y Wang, L Zhang, V Salapura - Cluster Computing, 2017 - Springer
Spark has been increasingly employed by industries for big data analytics recently, due to its
resilience, scalability and efficient in-memory distributed programming model. Meanwhile …

Towards automatic tuning of apache spark configuration

N Nguyen, MMH Khan, K Wang - 2018 IEEE 11th International …, 2018 - ieeexplore.ieee.org
Apache Spark provides a large number of configuration settings that may be tuned to
improve the performance of specific applications running on the platform. However, it is non …