mrMoulder: A recommendation-based adaptive parameter tuning approach for big data processing platform
Nowadays the world has entered the big data era. Big data processing platforms, such as
Hadoop and Spark, are increasingly adopted by many applications, in which there are …
Scaling spark in the real world: performance and usability
Apache Spark is one of the most widely used open source processing engines for big data,
with rich language-integrated APIs and a wide range of libraries. Over the past two years …
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems
The past decade has seen rapid growth of distributed stream data processing systems.
Under these systems, a stream application is realized as a Directed Acyclic Graph (DAG) of …
Flare: Native compilation for heterogeneous workloads in Apache Spark
GM Essertel, RY Tahboub, JM Decker… - arXiv preprint arXiv …, 2017 - arxiv.org
The need for modern data analytics to combine relational, procedural, and map-reduce-style
functional processing is widely recognized. State-of-the-art systems like Spark have added …
Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service
D Ravikumar, A Yeo, Y Zhu, A Lakra… - Proceedings of the …, 2024 - dl.acm.org
The proliferation of big data and analytic workloads has driven the need for cloud compute
and cluster-based job processing. With Apache Spark, users can process terabytes of data …
A model driven approach towards improving the performance of apache spark applications
Apache Spark applications often execute in multiple stages where each stage consists of
multiple tasks running in parallel. However, prior efforts noted that the execution time of …
[BOOK][B] Mastering Apache Spark 2.x
R Kienzler - 2017 - books.google.com
Advanced analytics on your Big Data with latest Apache Spark 2.x About This Book An
advanced guide with a combination of instructions and practical examples to extend the …
Spark-diy: A framework for interoperable spark operations with high performance block-based data models
Today's scientific applications are increasingly relying on a variety of data sources, storage
facilities, and computing infrastructures, and there is a growing demand for data analysis …
Insights on apache spark usage by mining stack overflow questions
LJ Rodríguez, X Wang, J Kuang - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Apache Spark is one of the most popular big data tools. Despite its popularity, there are no
studies regarding its overall usage among software developers. As such, essential …
Optimizations of Distributed Computing Processes on Apache Spark Platform.
The frequently difficult process of examining large and diverse amounts of information is
known as "big data analysis." The goal is to find insights, such as hidden patterns …