QuateXelero: an accelerated exact network motif detection algorithm
Finding motifs in biological, social, technological, and other types of networks has become a widespread method for gaining knowledge about these networks’ structure and function. However, the task is computationally demanding, because it is closely tied to graph isomorphism, a problem in NP that is not known to be in P or to be NP-complete. Accordingly, this research aims to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a quaternary tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. Experimental results on some standard model networks confirm the overall superiority of QuateXelero over two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast at constructing the central data structure of the algorithm from scratch based on the input network.
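For orientation, the ESU enumeration step that the abstract builds on can be sketched as follows. This is a minimal sketch of plain ESU only, assuming an undirected graph stored as a dict of neighbor sets with comparable (here integer) node labels. It does not reproduce QuateXelero's quaternary tree or any NAUTY call, and the function name `esu_subgraphs` is hypothetical.

```python
def esu_subgraphs(adj, k):
    """Enumerate every connected k-node subgraph exactly once (ESU scheme).

    adj: dict mapping each node to the set of its neighbors (undirected).
    Returns a list of frozensets of nodes.
    """
    found = []

    def extend(sub, ext, root):
        if len(sub) == k:
            found.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # Nodes already in the subgraph or adjacent to it.
            closed = set(sub) | {u for s in sub for u in adj[s]}
            # Exclusive neighbors of w: larger than the root and not
            # reachable from the current subgraph.
            exclusive = {u for u in adj[w] if u > root and u not in closed}
            extend(sub | {w}, ext | exclusive, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return found


# Usage: a triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triples = esu_subgraphs(graph, 3)
```

In a full motif detector, each enumerated node set would then be assigned to an isomorphism class, which is the step where the abstract says calls to NAUTY dominate the running time.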
Significant Subgraph Mining with Multiple Testing Correction
The problem of finding itemsets that are statistically significantly enriched
in a class of transactions is complicated by the need to correct for multiple
hypothesis testing. Pruning untestable hypotheses was recently proposed as a
strategy for this task of significant itemset mining. It was shown to lead to
greater statistical power, that is, the discovery of more truly significant itemsets,
than the standard Bonferroni correction on real-world datasets. An open
question, however, is whether this strategy of excluding untestable hypotheses
also leads to greater statistical power in subgraph mining, in which the number
of hypotheses is much larger than in itemset mining. Here we answer this
question by an empirical investigation on eight popular graph benchmark
datasets. We propose a new efficient search strategy, which always returns the
same solution as the state-of-the-art approach and is approximately two orders
of magnitude faster. Moreover, we exploit the dependence between subgraphs by
considering the effective number of tests and thereby further increase the
statistical power. Comment: 18 pages, 5 figures, accepted to the 2015 SIAM International
Conference on Data Mining (SDM15)
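The idea of pruning untestable hypotheses can be illustrated with a small sketch. It rests on assumptions not stated in the abstract: a one-sided Fisher's exact test for enrichment in one class, and a Tarone-style search for the correction factor. The function names `min_p_value` and `tarone_threshold` are hypothetical, and this is not the search strategy proposed in the paper.

```python
from math import comb


def min_p_value(support, n_pos, n_total):
    """Smallest p-value a pattern with the given support could ever attain
    under a one-sided Fisher's exact test (assumed test), reached when as
    many of its occurrences as possible fall in the positive class."""
    a_max = min(support, n_pos)
    return (comb(n_pos, a_max) * comb(n_total - n_pos, support - a_max)
            / comb(n_total, support))


def tarone_threshold(min_pvalues, alpha=0.05):
    """Find the smallest k such that at most k hypotheses are testable at
    level alpha / k; hypotheses whose minimum p-value exceeds alpha / k can
    never be significant and are not counted in the correction."""
    k = 1
    while sum(p <= alpha / k for p in min_pvalues) > k:
        k += 1
    return alpha / k, k


# Usage: 20 graphs, 10 in the positive class, 7 candidate patterns.
supports = [1, 1, 2, 2, 5, 8, 10]
pmins = [min_p_value(x, 10, 20) for x in supports]
threshold, k = tarone_threshold(pmins)   # k is 3 here, versus 7 for Bonferroni
```

Because rare patterns cannot reach small p-values, they drop out of the correction factor, which is why the threshold is less strict than a Bonferroni correction over all candidates.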
Subgraph covers -- An information theoretic approach to motif analysis in networks
Many real world networks contain a statistically surprising number of certain
subgraphs, called network motifs. In the prevalent approach to motif analysis,
network motifs are detected by comparing subgraph frequencies in the original
network with a statistical null model. In this paper we propose an alternative
approach to motif analysis where network motifs are defined to be connectivity
patterns that occur in a subgraph cover that represents the network using
minimal total information. A subgraph cover is defined to be a set of subgraphs
such that every edge of the graph is contained in at least one of the subgraphs
in the cover. Some recently introduced random graph models that can incorporate
significant densities of motifs have natural formulations in terms of subgraph
covers and the presented approach can be used to match networks with such
models. To prove the practical value of our approach we also present a
heuristic for the resulting NP-hard optimization problem and give results for
several real world networks. Comment: 10 pages, 7 tables, 1 figure
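The definition of a subgraph cover given in the abstract can be checked mechanically. The sketch below assumes an undirected graph given as a list of edges and each candidate subgraph given as its own edge list; the function name `is_subgraph_cover` is hypothetical, and the sketch only verifies the covering property, not the minimal-information criterion.

```python
def is_subgraph_cover(graph_edges, subgraphs):
    """Return True if every edge of the graph lies in at least one of the
    given subgraphs and no subgraph uses an edge absent from the graph."""
    required = {frozenset(e) for e in graph_edges}
    covered = {frozenset(e) for sub in subgraphs for e in sub}
    return covered == required


# Usage: a triangle 0-1-2 with a pendant edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
triangle = [(0, 1), (1, 2), (2, 0)]
pendant = [(2, 3)]
ok = is_subgraph_cover(edges, [triangle, pendant])
```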