[BOOK][B] Spark: Big data cluster computing in production
I Ganelin, E Orhian, K Sasaki, B York - 2016 - books.google.com
Production-targeted Spark guidance with real-world use cases Spark: Big Data Cluster
Computing in Production goes beyond general Spark overviews to provide targeted …
Descending through a crowded valley - benchmarking deep learning optimizers
RM Schmidt, F Schneider… - … Conference on Machine …, 2021 - proceedings.mlr.press
Choosing the optimizer is considered to be among the most crucial design decisions in deep
learning, and it is not an easy one. The growing literature now lists hundreds of optimization …
A heterogeneity-aware task scheduler for spark
Big data processing systems such as Spark are employed in an increasing number of
diverse applications, such as machine learning, graph computation, and scientific computing …
Open source vizier: Distributed infrastructure and api for reliable and flexible blackbox optimization
Vizier is the de-facto blackbox optimization service across Google, having optimized some of
Google's largest products and research efforts. To operate at the scale of tuning thousands …
Hyper-tune: Towards efficient hyper-parameter tuning at scale
The ever-growing demand and complexity of machine learning are putting pressure on
hyper-parameter tuning systems: while the evaluation cost of models continues to increase …
Effective data management strategy and RDD weight cache replacement strategy in Spark
K Jiang, S Du, F Zhao, Y Huang, C Li, Y Luo - Computer Communications, 2022 - Elsevier
With the dramatic increase in internet users and their demand for real-time network
performance, the Spark distributed computing environment has emerged. It is widely used …
Optimizing shuffle in wide-area data analytics
As increasingly large volumes of raw data are generated at geographically distributed
datacenters, they need to be efficiently processed by data analytic jobs spanning multiple …
Adaptively accelerating map-reduce/spark with GPUs: A case study
In this paper, we propose and evaluate a simple mechanism to accelerate iterative machine
learning algorithms implemented in Hadoop map-reduce (stock), and Apache Spark. In …
[BOOK][B] Data Analytics with Spark Using Python
J Aven - 2018 - books.google.com
Spark is at the heart of today's Big Data revolution, helping data professionals supercharge
efficiency and performance in a wide range of data processing and analytics tasks. In this …
SMBSP: a self-tuning approach using machine learning to improve performance of spark in big data processing
Apache Spark, popularly known for big data processing capability, is a distributed open-
source platform that uses the concept of distributed memory to facilitate big data processing …