Efficient Comparison of Massive Graphs Through The Use Of 'Graph Fingerprints'

Abstract

Comparing empirical graphs is a problem of great interest within network science. The ability to compare graphs accurately but efficiently has a significant impact in areas such as temporal graph evolution, anomaly detection and protein comparison. The problem is compounded when working with graphs containing millions of anonymous, i.e. unlabelled, vertices and edges, because comparing two or more such graphs is computationally very expensive. Reducing a graph to a much smaller feature set, called a fingerprint, that accurately captures the essence of the graph would therefore be highly desirable. Such an approach would also have potential applications outside graph comparison, especially in machine learning. This paper introduces a feature-extraction-based approach for the efficient comparison of large, topologically similar but order-varying, unlabelled graph datasets. The approach produces a 'Graph Fingerprint' that represents both vertex-level and global-level topological features of a graph. It is shown to be efficient when comparing graphs which are highly topologically similar but vary in order, and it scales linearly with the size and complexity of the graphs being fingerprinted.
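To make the idea of a fingerprint concrete, the following is a minimal illustrative sketch in Python, not the method evaluated in this paper. The abstract does not specify which features are used, so everything here is an assumption made for illustration: degree and local clustering coefficient as vertex-level features, their mean and standard deviation as summaries, edge density as the global-level feature, Euclidean distance as the comparison measure, and the function names `build_adjacency`, `local_clustering`, `fingerprint` and `distance`. Note also that this naive clustering computation is quadratic in vertex degree, so the sketch does not reproduce the linear scaling claimed above.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def build_adjacency(edges):
    """Build an undirected adjacency map from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves adjacent."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def fingerprint(edges):
    """Reduce a graph to a short, label-independent feature vector.

    The feature choice is hypothetical: summaries of two vertex-level
    distributions (degree, local clustering) plus one global feature
    (edge density).
    """
    adj = build_adjacency(edges)
    n = len(adj)
    if n == 0:
        return [0.0, 0.0, 0.0, 0.0, 0.0]
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    degrees = [len(nbrs) for nbrs in adj.values()]
    clustering = [local_clustering(adj, v) for v in adj]
    density = 2.0 * m / (n * (n - 1)) if n > 1 else 0.0
    return [
        mean(degrees), pstdev(degrees),
        mean(clustering), pstdev(clustering),
        density,
    ]


def distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints (an assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)))


triangle = [(0, 1), (1, 2), (0, 2)]
relabelled_triangle = [("a", "b"), ("b", "c"), ("c", "a")]
path = [(0, 1), (1, 2)]

print(distance(fingerprint(triangle), fingerprint(relabelled_triangle)))
print(distance(fingerprint(triangle), fingerprint(path)))
```

Because the fingerprint depends only on topology, the two triangles are at distance zero despite their different vertex labels, while the triangle and the path are separated. Once each graph has been fingerprinted, any pairwise comparison costs only as much as comparing two short vectors, regardless of the size of the original graphs.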
