Related articles - Academic resource search

[BOOK][B] Spark: Big data cluster computing in production

I Ganelin, E Orhian, K Sasaki, B York - 2016 - books.google.com

Production-targeted Spark guidance with real-world use cases Spark: Big Data Cluster
Computing in Production goes beyond general Spark overviews to provide targeted …

Cited by 23 Related articles All 2 versions

[PDF] mlr.press

Descending through a crowded valley - benchmarking deep learning optimizers

RM Schmidt, F Schneider… - … Conference on Machine …, 2021 - proceedings.mlr.press

Choosing the optimizer is considered to be among the most crucial design decisions in deep
learning, and it is not an easy one. The growing literature now lists hundreds of optimization …

Cited by 184 Related articles All 7 versions

[PDF] osti.gov

A heterogeneity-aware task scheduler for spark

L Xu, AR Butt, SH Lim, R Kannan - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Big data processing systems such as Spark are employed in an increasing number of
diverse applications-such as machine learning, graph computation, and scientific computing …

Cited by 26 Related articles All 8 versions

[PDF] mlr.press

Open source vizier: Distributed infrastructure and api for reliable and flexible blackbox optimization

X Song, S Perel, C Lee, G Kochanski… - International …, 2022 - proceedings.mlr.press

Vizier is the de-facto blackbox optimization service across Google, having optimized some of
Google's largest products and research efforts. To operate at the scale of tuning thousands …

Cited by 26 Related articles All 5 versions

[PDF] arxiv.org

Hyper-tune: Towards efficient hyper-parameter tuning at scale

Y Li, Y Shen, H Jiang, W Zhang, J Li, J Liu… - arXiv preprint arXiv …, 2022 - arxiv.org

The ever-growing demand and complexity of machine learning are putting pressure on
hyper-parameter tuning systems: while the evaluation cost of models continues to increase …

Cited by 18 Related articles All 6 versions

Effective data management strategy and RDD weight cache replacement strategy in Spark

K Jiang, S Du, F Zhao, Y Huang, C Li, Y Luo - Computer Communications, 2022 - Elsevier

With the dramatic increase in internet users and their demand for real-time network
performance, the Spark distributed computing environment has emerged. It is widely used …

Cited by 6 Related articles

[PDF] toronto.edu

Optimizing shuffle in wide-area data analytics

S Liu, H Wang, B Li - 2017 IEEE 37th International Conference …, 2017 - ieeexplore.ieee.org

As increasingly large volumes of raw data are generated at geographically distributed
datacenters, they need to be efficiently processed by data analytic jobs spanning multiple …

Cited by 21 Related articles All 4 versions

[PDF] stonybrook.edu

Adaptively accelerating map-reduce/spark with GPUs: A case study

KR Jayaram, A Gandhi, H Xin… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org

In this paper, we propose and evaluate a simple mechanism to accelerate iterative machine
learning algorithms implemented in Hadoop map-reduce (stock), and Apache Spark. In …

Cited by 8 Related articles All 4 versions

[BOOK][B] Data Analytics with Spark Using Python

J Aven - 2018 - books.google.com

Spark is at the heart of today's Big Data revolution, helping data professionals supercharge
efficiency and performance in a wide range of data processing and analytics tasks. In this …

Cited by 10 Related articles All 2 versions

SMBSP: a self-tuning approach using machine learning to improve performance of spark in big data processing

MA Rahman, J Hossen… - 2018 7th International …, 2018 - ieeexplore.ieee.org

Apache Spark, popularly known for big data processing capability, is a distributed
open-source platform that uses the concept of distributed memory to facilitate big data processing …

Cited by 13 Related articles