Related articles - Academic resource search

HyperDrive: Exploring hyperparameters with POP scheduling

J Rasley, Y He, F Yan, O Ruwase… - Proceedings of the 18th …, 2017 - dl.acm.org

The quality of machine learning (ML) and deep learning (DL) models are very sensitive to
many different adjustable parameters that are set before training even begins, commonly …

Cited by 63 · Related articles · All 5 versions

Cost-effective resource provisioning for spark workloads

Y Chen, J Lu, C Chen, M Hoque… - Proceedings of the 28th …, 2019 - dl.acm.org

Spark is one of the prevalent big data analytical platforms. Configuring proper resource
provision for Spark jobs is challenging but essential for organizations to save time, achieve …

Cited by 20 · Related articles · All 7 versions

Flare: Optimizing Apache Spark with Native Compilation for Scale-Up Architectures and Medium-Size Data

G Essertel, R Tahboub, J Decker, K Brown… - … USENIX Symposium on …, 2018 - usenix.org

In recent years, Apache Spark has become the de facto standard for big data processing.
Spark has enabled a wide audience of users to process petabyte-scale workloads due to its …

Cited by 70 · Related articles · All 11 versions

SparkBench: a spark benchmarking suite characterizing large-scale in-memory data analytics

M Li, J Tan, Y Wang, L Zhang, V Salapura - Cluster Computing, 2017 - Springer

Spark has been increasingly employed by industries for big data analytics recently, due to its
resilience, scalability and efficient in-memory distributed programming model. Meanwhile …

Cited by 47 · Related articles · All 4 versions

Model averaging in distributed machine learning: a case study with Apache Spark

Y Guo, Z Zhang, J Jiang, W Wu, C Zhang, B Cui, J Li - The VLDB Journal, 2021 - Springer

The increasing popularity of Apache Spark has attracted many users to put their data into its
ecosystem. On the other hand, it has been witnessed in the literature that Spark is slow …

Cited by 13 · Related articles · All 7 versions

Elastic executor provisioning for iterative workloads on apache spark

D Yang, W Rang, D Cheng, Y Wang… - … Conference on Big …, 2019 - ieeexplore.ieee.org

In memory data analytic frameworks like Apache Spark are employed by an increasing
number of diverse applications, such as machine learning, graph computation, and scientific …

Cited by 9 · Related articles · All 6 versions

Optimizing performance of Real-Time Big Data stateful streaming applications on Cloud

A Gupta, S Jain - 2022 IEEE International Conference on Big …, 2022 - ieeexplore.ieee.org

Exponential growth in the volume of data generated over the last decade has triggered
massive research and adoption of distributed big data analytics platforms. In real-time …

Cited by 7 · Related articles · All 2 versions

Tuneful: An online significance-aware configuration tuner for big data analytics

A Fekry, L Carata, T Pasquier, A Rice… - arXiv preprint arXiv …, 2020 - arxiv.org

Distributed analytics engines such as Spark are a common choice for processing extremely
large datasets. However, finding good configurations for these systems remains challenging …

Cited by 20 · Related articles · All 3 versions

Towards automatic tuning of apache spark configuration

N Nguyen, MMH Khan, K Wang - 2018 IEEE 11th International …, 2018 - ieeexplore.ieee.org

Apache Spark provides a large number of configuration settings that may be tuned to
improve the performance of specific applications running on the platform. However, it is non …

Cited by 38 · Related articles · All 3 versions

How data volume affects spark based data analytics on a scale-up server

AJ Awan, M Brorsson, V Vlassov, E Ayguade - Big Data Benchmarks …, 2016 - Springer

Sheer increase in volume of data over the last decade has triggered research in cluster
computing frameworks that enable web enterprises to extract big insights from big data …

Cited by 29 · Related articles · All 12 versions