REEF: Retainable evaluator execution framework

Scaling distributed machine learning with the parameter server

M Li, DG Andersen, JW Park, AJ Smola… - … USENIX Symposium on …, 2014 - usenix.org

We propose a parameter server framework for distributed machine learning problems. Both
data and workloads are distributed over worker nodes, while the server nodes maintain …

Cited by 2320 - Related articles - All 39 versions

[PDF] carleton.ca

Apache Hadoop YARN: Yet another resource negotiator

VK Vavilapalli, AC Murthy, C Douglas… - Proceedings of the 4th …, 2013 - dl.acm.org

The initial design of Apache Hadoop [1] was tightly focused on running massive,
MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has …

Cited by 2921 - Related articles - All 21 versions

[PDF] ust.hk

Apache Tez: A unifying framework for modeling and building data processing applications

B Saha, H Shah, S Seth, G Vijayaraghavan… - Proceedings of the …, 2015 - dl.acm.org

The broad success of Hadoop has led to a fast-evolving and diverse ecosystem of
application engines that are building upon the YARN resource management layer. The open …

Cited by 298 - Related articles - All 12 versions

[PDF] usenix.org

Mercury: Hybrid centralized and distributed scheduling in large shared clusters

K Karanasos, S Rao, C Curino, C Douglas… - 2015 USENIX Annual …, 2015 - usenix.org

Datacenter-scale computing for analytics workloads is increasingly common. High
operational costs force heterogeneous applications to share cluster resources for achieving …

Cited by 248 - Related articles - All 11 versions

[PDF] vldb.org

Trill: A high-performance incremental query processor for diverse analytics

B Chandramouli, J Goldstein, M Barnett… - Proceedings of the …, 2014 - dl.acm.org

This paper introduces Trill--a new query processor for analytics. Trill fulfills a combination of
three requirements for a query processor to serve the diverse big data analytics space: (1) …

Cited by 256 - Related articles - All 18 versions

[PDF] researchgate.net

Resource elasticity for large-scale machine learning

B Huang, M Boehm, Y Tian, B Reinwald… - Proceedings of the …, 2015 - dl.acm.org

Declarative large-scale machine learning (ML) aims at flexible specification of ML algorithms
and automatic generation of hybrid runtime plans ranging from single node, in-memory …

Cited by 76 - Related articles - All 5 versions

[PDF] cmu.edu

[PDF] Scaling distributed machine learning with system and algorithm co-design

M Li - Santa Clara, CA, USA: Intel, 2017 - reports-archive.adm.cs.cmu.edu

Due to the rapid growth of data and the ever increasing model complexity, which often
manifests itself in the large number of model parameters, today, many important machine …

Cited by 55 - Related articles - All 6 versions

[PDF] cmu.edu

[PDF] User behavior modeling with large-scale graph analysis

A Beutel - Computer Science Department, Carnegie …, 2016 - reports-archive.adm.cs.cmu.edu

Can we model how fraudsters work to distinguish them from normal users? Can we predict
not just which movie a person will like, but also why? How can we find when a student will …

Cited by 26 - Related articles - All 6 versions

[PDF] illinois.edu

[PDF] Dolphin: Runtime optimization for distributed machine learning

YSL Lee, M Weimer, Y Yang… - Proc. of ICML ML …, 2016 - bj2.web.engr.illinois.edu

Large-scale machine learning (ML) systems are becoming widely used. Typically, these ML
systems run on fixed resources, but it is difficult to find their optimal configurations (e.g., how …

Cited by 21 - Related articles - All 9 versions

[PDF] arxiv.org

Performance evaluation of job schedulers on Hadoop YARN

JC Lin, MC Lee - Concurrency and Computation: Practice and …, 2016 - Wiley Online Library

To solve the limitation of Hadoop on scalability, resource sharing, and application support,
the open-source community proposes the next generation of Hadoop's compute platform …

Cited by 24 - Related articles - All 7 versions