Explainable and efficient link prediction in real-world network data

JE van Engelen, HD Boekhout, FW Takes - Advances in Intelligent Data …, 2016 - Springer
Advances in Intelligent Data Analysis XV: 15th International Symposium, IDA …, 2016, Springer
Abstract
Data that involves some sort of relationship or interaction can be represented, modelled and analyzed using the notion of a network. To understand the dynamics of networks, the link prediction problem is concerned with predicting the evolution of the topology of a network over time. Previous work in this direction has largely focussed on finding an extensive set of features capable of predicting the formation of a link, often within some domain-specific context. This sometimes results in a “black box” type of approach in which it is unclear how the (often computationally expensive) features contribute to the accuracy of the final predictor. This paper counters these problems by categorising the large set of proposed link prediction features based on their topological scope, and showing that the contribution of particular categories of features can actually be explained by simple structural properties of the network. An approach called the Efficient Feature Set is presented that uses a limited but explainable set of computationally efficient features that within each scope captures the essential network properties. Its performance is experimentally verified using a large number of diverse real-world network datasets. The result is a generic approach suitable for consistently predicting links with high accuracy.
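To make the idea of grouping features by topological scope concrete, here is a minimal sketch in Python. The abstract does not list the features in the Efficient Feature Set, so the features below (degree, preferential attachment, common neighbours, Jaccard coefficient, shortest-path distance) are assumptions: well-known, cheap link prediction features chosen only to illustrate node-level, neighbourhood-level and path-level scopes. The function names and the toy graph are likewise hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distance(adj, source, target):
    """Shortest-path length in hops, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def pair_features(adj, u, v):
    """Illustrative features for a candidate link (u, v), grouped by scope.

    These are assumed examples, not the feature set from the paper.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        # Node scope: properties of each endpoint on its own.
        "degree_u": len(nu),
        "degree_v": len(nv),
        "preferential_attachment": len(nu) * len(nv),
        # Neighbourhood scope: overlap of the direct neighbourhoods.
        "common_neighbours": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Path scope: how far apart the endpoints are in the graph.
        "distance": bfs_distance(adj, u, v),
    }


# Toy graph, purely for illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
features = pair_features(adj, "a", "d")
```

Each feature here can be read directly as a structural property of the network, which is the kind of explainability the abstract argues for. In practice such feature vectors would be fed to a classifier trained on links that appeared in a later snapshot of the network.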