2,195 research outputs found
GTRACE-RS: Efficient Graph Sequence Mining using Reverse Search
The mining of frequent subgraphs from labeled graph data has been studied
extensively. Furthermore, much attention has recently been paid to frequent
pattern mining from graph sequences. A method, called GTRACE, has been proposed
to mine frequent patterns from graph sequences under the assumption that
changes in graphs are gradual. Although GTRACE mines the frequent patterns
efficiently, it still needs substantial computation time to mine the patterns
from graph sequences containing large graphs and long sequences. In this paper,
we propose a new version of GTRACE that enables efficient mining of frequent
patterns based on the principle of a reverse search. The underlying concept of
the reverse search is a general scheme for designing efficient algorithms for
hard enumeration problems. Our performance study shows that the proposed method
is efficient and scalable for mining both long and large graph sequence
patterns and is several orders of magnitude faster than the original GTRACE
High performance subgraph mining in molecular compounds
Structured data represented in the form of graphs arises in
several fields of the science and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining
problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main
aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated, load balancing
algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to linear speedup in a network
of workstations
- …