With the emergence of graph databases, the task of frequent subgraph
discovery has been extensively addressed. Although the proposed approaches in
the literature have made this task feasible, the number of discovered frequent
subgraphs is still too high to be used efficiently in any further exploration.
Feature selection for graph data is a way to reduce the high number of frequent
subgraphs based on exact or approximate structural similarity. However, current
structural similarity strategies are not efficient enough for many real-world
applications. Moreover, the combinatorial nature of graphs makes them
computationally very costly. In order to select a smaller yet structurally
irredundant set of subgraphs, we propose a novel approach that mines the top-k
topological representative subgraphs among the frequent ones. Our approach
can detect hidden structural similarities that existing approaches are
unable to detect, such as similarities in subgraph density or diameter. In
addition, it can be easily extended using any user-defined structural or
topological attributes depending on the sought properties. Empirical studies on
real and synthetic graph datasets show that our approach is fast and scalable