DistDGL: Distributed graph neural network training for billion-scale graphs

D Zheng, C Ma, M Wang, J Zhou, Q Su… - 2020 IEEE/ACM 10th …, 2020 - ieeexplore.ieee.org
Graph neural networks (GNN) have shown great success in learning from graph-structured
data. They are widely used in various applications, such as recommendation, fraud …

Pangolin: An efficient and flexible graph mining system on CPU and GPU

X Chen, R Dathathri, G Gill, K Pingali - Proceedings of the VLDB …, 2020 - dl.acm.org
There is growing interest in graph pattern mining (GPM) problems such as motif counting.
GPM systems have been developed to provide unified interfaces for programming …

Alleviating irregularity in graph analytics acceleration: A hardware/software co-design approach

M Yan, X Hu, S Li, A Basak, H Li, X Ma… - Proceedings of the …, 2019 - dl.acm.org
Graph analytics is an emerging application which extracts insights by processing large
volumes of highly connected data, namely graphs. The parallel processing of graphs has …

DepGraph: A dependency-driven accelerator for efficient iterative graph processing

Y Zhang, X Liao, H Jin, L He, B He… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Many graph processing systems have been recently developed for many-core processors.
However, for iterative graph processing, due to the dependencies between vertices' states …

Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources

K Kanellopoulos, HC Nam, N Bostanci, R Bera… - Proceedings of the 56th …, 2023 - dl.acm.org
Address translation is a performance bottleneck in data-intensive workloads due to large
datasets and irregular access patterns that lead to frequent high-latency page table walks …

The Intel Programmable and Integrated Unified Memory Architecture graph analytics processor

S Aananthakrishnan, S Abedin, V Cavé… - IEEE Micro, 2023 - ieeexplore.ieee.org
High-performance large-scale graph analytics are essential to timely analyze relationships
in big datasets. Conventional processor architectures suffer from inefficient resource usage …

The First Direct Mesh-to-Mesh Photonic Fabric

J Howard, JB Fryman, S Abedin - IEEE Micro, 2024 - ieeexplore.ieee.org
Intel developed the Programmable Integrated Unified Memory Architecture (PIUMA) to
address inefficiencies seen in conventional processor architectures for at-scale sparse …

Planting trees for scalable and efficient canonical hub labeling

K Lakhotia, Q Dong, R Kannan, V Prasanna - arXiv preprint arXiv …, 2019 - arxiv.org
Point-to-Point Shortest Distance (PPSD) query is a crucial primitive in graph database
applications. Hub labeling algorithms compute a labeling that converts a PPSD query into a …

SAGA-Bench: Software and hardware characterization of streaming graph analytics workloads

A Basak, J Lin, R Lorica, X Xie, Z Chishti… - … Analysis of Systems …, 2020 - ieeexplore.ieee.org
Many application scenarios such as social network analysis and real-time financial fraud
detection involve performing batched updates and analytics on a time-evolving or streaming …

Pinning Page Structure Entries to Last-Level Cache for Fast Address Translation

O Kwon, Y Lee, S Hong - IEEE Access, 2022 - ieeexplore.ieee.org
As the memory footprint of emerging applications continues to increase, the address
translation becomes a critical performance bottleneck owing to frequent misses on the …