research

Towards an Efficient Discovery of the Topological Representative Subgraphs

Abstract

With the emergence of graph databases, the task of frequent subgraph discovery has been extensively addressed. Although the proposed approaches in the literature have made this task feasible, the number of discovered frequent subgraphs is still very high to be efficiently used in any further exploration. Feature selection for graph data is a way to reduce the high number of frequent subgraphs based on exact or approximate structural similarity. However, current structural similarity strategies are not efficient enough in many real-world applications, besides, the combinatorial nature of graphs makes it computationally very costly. In order to select a smaller yet structurally irredundant set of subgraphs, we propose a novel approach that mines the top-k topological representative subgraphs among the frequent ones. Our approach allows detecting hidden structural similarities that existing approaches are unable to detect such as the density or the diameter of the subgraph. In addition, it can be easily extended using any user defined structural or topological attributes depending on the sought properties. Empirical studies on real and synthetic graph datasets show that our approach is fast and scalable

    Similar works