An Efficient Implementation of a Subgraph Isomorphism Algorithm for GPUs.
The subgraph isomorphism problem is a computational task that arises in a wide range of today's applications, from the understanding of biological networks to the analysis of social networks. Even though different implementations for CPUs have been proposed to improve the efficiency of such a graph search algorithm, they have been shown to be bounded by the intrinsic sequential nature of the algorithm. More recently, graphics processing units (GPUs) have become widespread platforms that provide massive parallelism at low cost. Nevertheless, parallelizing any efficient and optimized sequential algorithm for subgraph isomorphism on many-core architectures is a very challenging task. This article presents a parallel implementation of the subgraph isomorphism algorithm for GPUs. Different strategies are implemented to deal with the space complexity of the graph searching algorithm, the potential workload imbalance, and the thread divergence caused by the non-homogeneity of actual graphs. The paper presents the results obtained on several graphs of different sizes and characteristics to understand the efficiency of the proposed approach.
Between Subgraph Isomorphism and Maximum Common Subgraph
When a small pattern graph does not occur inside a larger target graph, we can ask how to find "as much of the pattern as possible" inside the target graph. In general, this is known as the maximum common subgraph problem, which is much more computationally challenging in practice than subgraph isomorphism. We introduce a restricted alternative, where we ask if all but k vertices from the pattern can be found in the target graph. This allows for the development of slightly weakened forms of certain invariants from subgraph isomorphism which are based upon degree and number of paths. We show that when k is small, weakening the invariants still retains much of their effectiveness. We are then able to solve this problem on the standard problem instances used to benchmark subgraph isomorphism algorithms, despite these instances being too large for current maximum common subgraph algorithms to handle. Finally, by iteratively increasing k, we obtain an algorithm which is also competitive for the maximum common subgraph problem.
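The "all but k vertices" question and the idea of iteratively increasing k can be illustrated with a brute-force sketch. This is not the paper's algorithm: it uses no degree or path invariants, it assumes the non-induced variant (every pattern edge between two kept vertices must map to a target edge), and the function names `embeds`, `k_less_subgraph_iso`, and `smallest_k` are hypothetical. It is only practical for very small graphs.

```python
from itertools import combinations, permutations


def embeds(pattern_edges, kept, target_adj):
    """Return True if the kept pattern vertices map injectively into the
    target so that every pattern edge between kept vertices is preserved."""
    kept = list(kept)
    kept_set = set(kept)
    edges = [(u, v) for u, v in pattern_edges if u in kept_set and v in kept_set]
    # Try every injective assignment of kept pattern vertices to target vertices.
    for image in permutations(target_adj, len(kept)):
        m = dict(zip(kept, image))
        if all(m[v] in target_adj[m[u]] for u, v in edges):
            return True
    return False


def k_less_subgraph_iso(pattern_vertices, pattern_edges, target_adj, k):
    """Can all but k pattern vertices be found in the target?"""
    keep = len(pattern_vertices) - k
    return any(
        embeds(pattern_edges, kept, target_adj)
        for kept in combinations(pattern_vertices, keep)
    )


def smallest_k(pattern_vertices, pattern_edges, target_adj):
    """Increase k from 0 until the restricted problem becomes satisfiable."""
    for k in range(len(pattern_vertices) + 1):
        if k_less_subgraph_iso(pattern_vertices, pattern_edges, target_adj, k):
            return k


# Pattern: a triangle. Target: a path a-b-c (undirected adjacency sets).
triangle_vertices = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
path_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

With these inputs the triangle does not occur in the path (k = 0 fails), but dropping one pattern vertex leaves a single edge, which does occur, so the loop in `smallest_k` stops at k = 1.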
Shared Memory Parallel Subgraph Enumeration
The subgraph enumeration problem asks us to find all subgraphs of a target
graph that are isomorphic to a given pattern graph. Determining whether even
one such isomorphic subgraph exists is NP-complete---and therefore finding all
such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration
has applications in many fields, including biochemistry and social networks,
and interestingly the fastest algorithms for solving the problem for
biochemical inputs are sequential. Since they depend on depth-first tree
traversal, an efficient parallelization is far from trivial. Nevertheless,
since important applications produce data sets with increasing difficulty,
parallelism seems beneficial.
We thus present here a shared-memory parallelization of the state-of-the-art
subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs)
by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing
and our implementation demonstrates a significant speedup on real-world
biochemical data---despite a highly irregular data access pattern. We also
improve RI-DS by pruning the search space better; this further improves the
empirical running times compared to the already highly tuned RI-DS.
Comment: 18 pages, 12 figures. To appear at the 7th IEEE Workshop on Parallel
/ Distributed Computing and Optimization (PDCO 2017)
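To make the depth-first tree traversal mentioned in this abstract concrete, here is a minimal sequential sketch of subgraph enumeration by backtracking. It is not RI or RI-DS: it uses no vertex ordering heuristic, no pruning beyond adjacency checks, and no work stealing. It assumes undirected graphs stored as adjacency sets and non-induced matching, and `enumerate_embeddings` is a hypothetical name.

```python
def enumerate_embeddings(pattern_adj, target_adj):
    """Yield every injective mapping of pattern vertices to target vertices
    that preserves all pattern edges (non-induced subgraph isomorphism)."""
    order = list(pattern_adj)  # fixed matching order; RI chooses this more cleverly
    mapping = {}               # pattern vertex -> target vertex
    used = set()               # target vertices already taken

    def extend(depth):
        if depth == len(order):
            yield dict(mapping)  # copy, because mapping is mutated on backtrack
            return
        u = order[depth]
        for t in target_adj:
            if t in used:
                continue
            # Every already-mapped neighbour of u must be adjacent to t.
            if all(mapping[n] in target_adj[t] for n in pattern_adj[u] if n in mapping):
                mapping[u] = t
                used.add(t)
                yield from extend(depth + 1)
                del mapping[u]
                used.remove(t)

    yield from extend(0)


# Pattern: a single edge. Target: a triangle.
edge_adj = {0: {1}, 1: {0}}
triangle_adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
embeddings = list(enumerate_embeddings(edge_adj, triangle_adj))
```

Each branch of the recursion depends on the partial mapping built above it, which is why the search tree is irregular and hard to split evenly across threads.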
Pattern matching and pattern discovery algorithms for protein topologies
We describe algorithms for pattern matching and pattern
learning in TOPS diagrams (formal descriptions of protein topologies).
These problems can be reduced to checking for subgraph isomorphism
and finding maximal common subgraphs in a restricted class of ordered
graphs. We have developed a subgraph isomorphism algorithm for
ordered graphs, which performs well on the given set of data. The
maximal common subgraph problem then is solved by repeated
subgraph extension and checking for isomorphisms. Despite its
apparent inefficiency, this approach gives an algorithm with time
complexity proportional to the number of graphs in the input set and is
still practical on the given set of data. As a result we obtain fast
methods which can be used for building a database of protein
topological motifs, and for the comparison of a given protein of known
secondary structure against a motif database.
Efficient mining of discriminative molecular fragments
Frequent pattern discovery in structured data is receiving increasing attention in many application areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution in a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.