Sandslash: a two-level framework for efficient graph pattern mining

X Chen, R Dathathri, G Gill, L Hoang… - Proceedings of the ACM …, 2021 - dl.acm.org
Proceedings of the ACM International Conference on Supercomputing, 2021dl.acm.org
Graph pattern mining (GPM) is a key building block in diverse applications, including
bioinformatics, chemical engineering, social network analysis, recommender systems and
security. Existing GPM frameworks either provide high-level interfaces for productivity at the
cost of expressiveness or provide low-level interfaces that can express a wide variety of
GPM algorithms at the cost of increased programming complexity. Moreover, existing
systems lack the flexibility to explore combinations of optimizations to achieve performance …
Graph pattern mining (GPM) is a key building block in diverse applications, including bioinformatics, chemical engineering, social network analysis, recommender systems and security. Existing GPM frameworks either provide high-level interfaces for productivity at the cost of expressiveness or provide low-level interfaces that can express a wide variety of GPM algorithms at the cost of increased programming complexity. Moreover, existing systems lack the flexibility to explore combinations of optimizations to achieve performance competitive with hand-optimized applications.
We present Sandslash, an in-memory graph pattern mining framework that uses a novel programming interface to support productive, expressive, and efficient GPM on large graphs. Sandslash provides a high-level API that needs only a specification of the GPM problem from the user, and the system implements fast subgraph enumeration, provides efficient data structures, and applies high-level optimizations automatically. To achieve performance competitive with expert-optimized implementations, Sandslash also provides a low-level API that allows users to express algorithm-specific optimizations. This enables Sandslash to support both high-productivity and high-efficiency without losing expressiveness. We evaluate Sandslash on shared-memory machines using five GPM applications and a wide range of large real-world graphs. Experimental results demonstrate that applications written using Sandslash high-level or low-level API outperform those in state-of-the-art GPM systems AutoMine, Pangolin, and Peregrine on average by 13.8x, 7.9x, and 5.4x, respectively. We also show that these Sandslash applications outperform expert-optimized GPM implementations by 2.3x on average with less programming effort.
ACM Digital Library
以上显示的是最相近的搜索结果。 查看全部搜索结果