Scaling distributed machine learning with the parameter server
We propose a parameter server framework for distributed machine learning problems. Both
data and workloads are distributed over worker nodes, while the server nodes maintain …
Apache hadoop yarn: Yet another resource negotiator
VK Vavilapalli, AC Murthy, C Douglas… - Proceedings of the 4th …, 2013 - dl.acm.org
The initial design of Apache Hadoop [1] was tightly focused on running massive,
MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has …
Apache tez: A unifying framework for modeling and building data processing applications
The broad success of Hadoop has led to a fast-evolving and diverse ecosystem of
application engines that are building upon the YARN resource management layer. The open …
Mercury: Hybrid centralized and distributed scheduling in large shared clusters
Datacenter-scale computing for analytics workloads is increasingly common. High
operational costs force heterogeneous applications to share cluster resources for achieving …
Trill: A high-performance incremental query processor for diverse analytics
B Chandramouli, J Goldstein, M Barnett… - Proceedings of the …, 2014 - dl.acm.org
This paper introduces Trill--a new query processor for analytics. Trill fulfills a combination of
three requirements for a query processor to serve the diverse big data analytics space: (1) …
Resource elasticity for large-scale machine learning
Declarative large-scale machine learning (ML) aims at flexible specification of ML algorithms
and automatic generation of hybrid runtime plans ranging from single node, in-memory …
Scaling distributed machine learning with system and algorithm co-design
M Li - Santa Clara, CA, USA: Intel, 2017 - reports-archive.adm.cs.cmu.edu
Due to the rapid growth of data and the ever increasing model complexity, which often
manifests itself in the large number of model parameters, today, many important machine …
User behavior modeling with large-scale graph analysis
A Beutel - Computer Science Department, Carnegie …, 2016 - reports-archive.adm.cs.cmu.edu
Can we model how fraudsters work to distinguish them from normal users? Can we predict
not just which movie a person will like, but also why? How can we find when a student will …
Dolphin: Runtime optimization for distributed machine learning
Large-scale machine learning (ML) systems are becoming widely used. Typically, these ML
systems run on fixed resources, but it is difficult to find their optimal configurations (e.g., how …
Performance evaluation of job schedulers on Hadoop YARN
To solve the limitation of Hadoop on scalability, resource sharing, and application support,
the open‐source community proposes the next generation of Hadoop's compute platform …