Efficient distribution for deep learning on large graphs

L Hoang, X Chen, H Lee, R Dathathri, G Gill… - 2021 - chenxuhao.github.io
Abstract
Graph neural networks (GNNs) are compute-intensive; thus, they are attractive candidates for acceleration on distributed platforms. We present DeepGalois, an efficient GNN framework targeting distributed CPUs. DeepGalois is designed for efficient communication of the high-dimensional feature vectors used in GNNs. Its graph partitioning engine flexibly supports different partitioning policies and helps the user make tradeoffs among task division, memory usage, and communication overhead, leading to fast feature learning without compromising accuracy. Its communication engine minimizes communication overhead by exploiting partitioning invariants and the communication bandwidth available in modern clusters. Evaluation on a production cluster with the representative reddit and ogbn-products datasets demonstrates that DeepGalois on 32 machines is 2.5× and 2.3× faster than on 1 machine in average epoch time and time to accuracy, respectively. On 32 machines, DeepGalois outperforms DistDGL by 4× and 8.9× in average epoch time and time to accuracy, respectively.
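As a loose illustration of the memory-versus-communication tradeoff that a partitioning engine like the one described above exposes, the sketch below estimates both costs under a simple 1D block vertex partitioning (an edge-cut-style policy). This is a hypothetical Python example, not DeepGalois's actual API: the names block_partition and estimate_costs, the toy graph, and the cost proxies (cut edges standing in for feature-vector communication, owned-plus-ghost vertices standing in for memory) are all assumptions made for illustration.

```python
# Hypothetical sketch (not the DeepGalois API): estimating the
# communication and memory costs of a vertex-partitioning policy.
from collections import defaultdict

def block_partition(num_vertices, num_hosts):
    """Contiguous 1D block partitioning: vertex v -> host v * num_hosts // num_vertices."""
    return [v * num_hosts // num_vertices for v in range(num_vertices)]

def estimate_costs(edges, owner):
    """Under vertex partitioning (edge-cut), count cut edges (a proxy for
    feature-vector communication) and the largest per-host vertex set,
    i.e. owned vertices plus ghost copies of remote neighbors (a memory proxy)."""
    held = defaultdict(set)          # host -> owned vertices + ghosts
    cut = 0
    for u, v in edges:
        hu, hv = owner[u], owner[v]
        held[hu].update((u, v))      # host hu owns u; v becomes a ghost if remote
        held[hv].update((u, v))      # host hv owns v; u becomes a ghost if remote
        if hu != hv:
            cut += 1
    mem = max(len(s) for s in held.values())
    return cut, mem

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
owner = block_partition(4, 2)        # owner = [0, 0, 1, 1]
print(estimate_costs(edges, owner))  # (3, 4): 3 cut edges, up to 4 vertices held per host
```

Under such a model, a policy that lowers the cut-edge count reduces how many high-dimensional feature vectors must be exchanged per epoch, typically at the price of more replicated state per host, which is the tradeoff the abstract attributes to the partitioning engine.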