Dorylus: Affordable, scalable, and accurate GNN training with distributed CPU servers and serverless threads

J Thorpe, Y Qiao, J Eyolfson, S Teng, G Hu… - … USENIX Symposium on …, 2021 - usenix.org
A graph neural network (GNN) enables deep learning on structured graph data. There are
two major GNN training obstacles: 1) it relies on high-end servers with many GPUs which …

NeuGraph: Parallel deep neural network computation on large graphs

L Ma, Z Yang, Y Miao, J Xue, M Wu, L Zhou… - 2019 USENIX Annual …, 2019 - usenix.org
Recent deep learning models have moved beyond low dimensional regular grids such as
image, video, and speech, to high-dimensional graph-structured data, such as social …

PowerLyra: Differentiated graph computation and partitioning on skewed graphs

R Chen, J Shi, Y Chen, B Zang, H Guan… - ACM Transactions on …, 2019 - dl.acm.org
Natural graphs with skewed distributions raise unique challenges to distributed graph
computation and partitioning. Existing graph-parallel systems usually use a “one-size-fits-all” …

EnGN: A high-throughput and energy-efficient accelerator for large graph neural networks

S Liang, Y Wang, C Liu, L He… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Graph neural networks (GNNs) emerge as a powerful approach to process non-Euclidean
data structures and have been proved powerful in various application domains such as …

KV-Direct: High-performance in-memory key-value store with programmable NIC

B Li, Z Ruan, W Xiao, Y Lu, Y Xiong, A Putnam… - Proceedings of the 26th …, 2017 - dl.acm.org
Performance of in-memory key-value store (KVS) continues to be of great importance as
modern KVS goes beyond the traditional object-caching workload and becomes a key …

Chaos: Scale-out graph processing from secondary storage

A Roy, L Bindschaedler, J Malicevic… - Proceedings of the 25th …, 2015 - dl.acm.org
Chaos scales graph processing from secondary storage to multiple machines in a cluster.
Earlier systems that process graphs from secondary storage are restricted to a single …

Mosaic: Processing a trillion-edge graph on a single machine

S Maass, C Min, S Kashyap, W Kang… - Proceedings of the …, 2017 - dl.acm.org
Processing a one trillion-edge graph has recently been demonstrated by distributed graph
engines running on clusters of tens to hundreds of nodes. In this paper, we employ a single …

ACC: Automatic ECN tuning for high-speed datacenter networks

S Yan, X Wang, X Zheng, Y Xia, D Liu… - Proceedings of the 2021 …, 2021 - dl.acm.org
For the widely deployed ECN-based congestion control schemes, the marking threshold is
the key to deliver high bandwidth and low latency. However, due to traffic dynamics in the …

Deconstructing RDMA-enabled distributed transactions: Hybrid is better!

X Wei, Z Dong, R Chen, H Chen - 13th USENIX Symposium on …, 2018 - usenix.org
There is currently an active debate on which RDMA primitive (i.e., one-sided or two-sided) is
optimal for distributed transactions. Such a debate has led to a number of optimizations …

GraphOne: A Data Store for Real-time Analytics on Evolving Graphs

P Kumar, HH Huang - ACM Transactions on Storage (TOS), 2020 - dl.acm.org
There is a growing need to perform a diverse set of real-time analytics (batch and stream
analytics) on evolving graphs to deliver the values of big data to users. The key requirement …