Subgraph Matching Kernels for Attributed Graphs
We propose graph kernels based on subgraph matchings, i.e.
structure-preserving bijections between subgraphs. While recently proposed
kernels based on common subgraphs (Wale et al., 2008; Shervashidze et al.,
2009) cannot in general be applied to attributed graphs, our approach rates
mappings of subgraphs with a flexible scoring scheme that compares vertex and
edge attributes by kernels. We show that subgraph matching kernels generalize
several known kernels. To compute the kernel we propose a graph-theoretical
algorithm inspired by a classical relation between common subgraphs of two
graphs and cliques in their product graph observed by Levi (1973). Encouraging
experimental results on a classification task of real-world graphs are
presented.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012)
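The relation attributed to Levi (1973) in the abstract above can be illustrated with a small sketch: in the product graph of two graphs, each clique corresponds to a structure-preserving bijection between induced subgraphs. The code below is only an illustration under stated assumptions, not the algorithm of the paper. It uses plain adjacency dictionaries, ignores vertex and edge attributes and any kernel weighting, and the names `modular_product` and `cliques` are hypothetical helpers introduced here.

```python
from itertools import product


def modular_product(adj1, adj2):
    """Return the modular product of two undirected graphs.

    Each graph is a dict mapping a vertex to the set of its neighbours.
    Product vertices are pairs (u, v). Two pairs are adjacent when they
    differ in both components and the edge relation agrees: either both
    u1-u2 and v1-v2 are edges, or both are non-edges.
    """
    nodes = list(product(adj1, adj2))
    padj = {n: set() for n in nodes}
    for (u1, v1), (u2, v2) in product(nodes, repeat=2):
        if u1 == u2 or v1 == v2:
            continue
        if (u2 in adj1[u1]) == (v2 in adj2[v1]):
            padj[(u1, v1)].add((u2, v2))
    return padj


def cliques(padj):
    """Yield every non-empty clique of the product graph exactly once.

    All cliques are enumerated, not only maximal ones, because every
    clique is one subgraph matching.
    """
    def extend(clique, candidates):
        if clique:
            yield clique
        for i, n in enumerate(candidates):
            # Keep only later candidates that are adjacent to n.
            rest = [m for m in candidates[i + 1:] if m in padj[n]]
            yield from extend(clique + [n], rest)

    yield from extend([], sorted(padj))


# Toy example (assumed data): a path on three vertices and a triangle.
path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
matchings = list(cliques(modular_product(path, triangle)))
```

Each element of `matchings` is a list of vertex pairs, i.e. one bijection between an induced subgraph of the path and one of the triangle. A kernel in the spirit of the abstract would sum a score over these matchings, with the score comparing the attributes of the paired vertices and edges; that scoring step is omitted here.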
Significant Subgraph Mining with Multiple Testing Correction
The problem of finding itemsets that are statistically significantly enriched
in a class of transactions is complicated by the need to correct for multiple
hypothesis testing. Pruning untestable hypotheses was recently proposed as a
strategy for this task of significant itemset mining. It was shown to lead to
greater statistical power, i.e. the discovery of more truly significant itemsets,
than the standard Bonferroni correction on real-world datasets. An open
question, however, is whether this strategy of excluding untestable hypotheses
also leads to greater statistical power in subgraph mining, in which the number
of hypotheses is much larger than in itemset mining. Here we answer this
question by an empirical investigation on eight popular graph benchmark
datasets. We propose a new efficient search strategy, which always returns the
same solution as the state-of-the-art approach and is approximately two orders
of magnitude faster. Moreover, we exploit the dependence between subgraphs by
considering the effective number of tests and thereby further increase the
statistical power.
Comment: 18 pages, 5 figures, accepted to the 2015 SIAM International
Conference on Data Mining (SDM15)
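The idea of pruning untestable hypotheses mentioned in the abstract above can be sketched as follows. A pattern with a given support has a minimum attainable p-value, and a pattern whose minimum p-value already exceeds the corrected threshold can never be significant, so it need not be counted in the correction. The sketch assumes a one-sided Fisher's exact test and the correction scheme usually credited to Tarone; it is not the search strategy proposed in the paper, and the function names and toy supports are hypothetical.

```python
from math import comb


def min_p_value(support, n_pos, n_total):
    """Smallest one-sided Fisher p-value attainable for a given support.

    The minimum is reached by the most extreme table: as many occurrences
    of the pattern as possible fall into the positive class.
    """
    x = support
    if x <= n_pos:
        # All x occurrences lie in the positive class.
        return comb(n_pos, x) / comb(n_total, x)
    # Every positive transaction contains the pattern, the rest are negatives.
    return comb(n_total - n_pos, x - n_pos) / comb(n_total, x)


def tarone_factor(supports, n_pos, n_total, alpha=0.05):
    """Smallest k such that at most k hypotheses are testable at alpha / k."""
    p_mins = [min_p_value(s, n_pos, n_total) for s in supports]
    k = 1
    while sum(p <= alpha / k for p in p_mins) > k:
        k += 1
    return k


# Toy example (assumed data): five patterns, 20 transactions, 10 positive.
supports = [1, 2, 3, 5, 8]
k = tarone_factor(supports, n_pos=10, n_total=20)
bonferroni_factor = len(supports)
```

With this scheme the per-test threshold is `alpha / k` rather than `alpha / bonferroni_factor`. Since `k` never exceeds the total number of hypotheses, the threshold is never stricter than Bonferroni's, which is the source of the gain in statistical power described in the abstract.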