DistDGL: Distributed graph neural network training for billion-scale graphs
Graph neural networks (GNN) have shown great success in learning from graph-structured
data. They are widely used in various applications, such as recommendation, fraud …
Pangolin: An efficient and flexible graph mining system on CPU and GPU
There is growing interest in graph pattern mining (GPM) problems such as motif counting.
GPM systems have been developed to provide unified interfaces for programming …
Alleviating irregularity in graph analytics acceleration: A hardware/software co-design approach
Graph analytics is an emerging application which extracts insights by processing large
volumes of highly connected data, namely graphs. The parallel processing of graphs has …
DepGraph: A dependency-driven accelerator for efficient iterative graph processing
Many graph processing systems have been recently developed for many-core processors.
However, for iterative graph processing, due to the dependencies between vertices' states …
Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources
Address translation is a performance bottleneck in data-intensive workloads due to large
datasets and irregular access patterns that lead to frequent high-latency page table walks …
The Intel programmable and integrated unified memory architecture graph analytics processor
S Aananthakrishnan, S Abedin, V Cavé… - IEEE Micro, 2023 - ieeexplore.ieee.org
High-performance large-scale graph analytics are essential to timely analyze relationships
in big datasets. Conventional processor architectures suffer from inefficient resource usage …
The First Direct Mesh-to-Mesh Photonic Fabric
J Howard, JB Fryman, S Abedin - IEEE Micro, 2024 - ieeexplore.ieee.org
Intel developed the Programmable Integrated Unified Memory Architecture (PIUMA) to
address inefficiencies seen in conventional processor architectures for at-scale sparse …
Planting trees for scalable and efficient canonical hub labeling
Point-to-Point Shortest Distance (PPSD) query is a crucial primitive in graph database
applications. Hub labeling algorithms compute a labeling that converts a PPSD query into a …
SAGA-Bench: Software and hardware characterization of streaming graph analytics workloads
Many application scenarios such as social network analysis and real-time financial fraud
detection involve performing batched updates and analytics on a time-evolving or streaming …
Pinning Page Structure Entries to Last-Level Cache for Fast Address Translation
As the memory footprint of emerging applications continues to increase, the address
translation becomes a critical performance bottleneck owing to frequent misses on the …