High performance subgraph mining in molecular compounds

Di Fatta, Giuseppe; Berthold, Michael R.

oai:centaur.reading.ac.uk:6153

Authors: Giuseppe Di Fatta, Michael R. Berthold
Publication date: 4 October 2005
Publisher: Springer
Abstract

Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem for discovering interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm: a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load-balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute's HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.
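The idea of receiver-initiated load balancing over an irregular search tree can be illustrated with a small single-process simulation. This is only a sketch and not the algorithm from the paper: the tree generator `expand`, the donor selection rule (the most loaded peer, chosen with global knowledge that a real peer-to-peer system would not have), and the "give away half of the oldest nodes" split are all assumptions made for illustration.

```python
from collections import deque
import random


def expand(node):
    """Hypothetical irregular search tree: a node is (depth, seed).

    The fanout depends on the seed, so subtree sizes are hard to predict.
    """
    depth, seed = node
    if depth >= 12:
        return []
    rng = random.Random(seed)
    fanout = rng.choice([0, 0, 1, 2, 4])
    return [(depth + 1, rng.randrange(10**9)) for _ in range(fanout)]


def count_sequential(root):
    """Reference count: visit every node of the tree with one worker."""
    stack, visited = [root], 0
    while stack:
        node = stack.pop()
        visited += 1
        stack.extend(expand(node))
    return visited


def simulate(root, num_workers=4):
    """Round-based simulation of receiver-initiated load balancing.

    Each worker keeps a local queue of unexplored nodes. A worker whose
    queue is empty (the receiver) asks a donor for work; the donor hands
    over half of its oldest nodes, which tend to root larger subtrees.
    """
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root)
    visited = [0] * num_workers
    transfers = 0
    while any(queues):
        for w in range(num_workers):
            if not queues[w]:
                # Assumption: the idle worker picks the most loaded peer.
                donor = max(range(num_workers), key=lambda p: len(queues[p]))
                if len(queues[donor]) < 2:
                    continue  # nothing worth splitting; stay idle this round
                for _ in range(len(queues[donor]) // 2):
                    queues[w].append(queues[donor].popleft())
                transfers += 1
            # Depth-first step on the local queue.
            node = queues[w].pop()
            visited[w] += 1
            queues[w].extend(expand(node))
    return visited, transfers


if __name__ == "__main__":
    root = (0, 42)
    per_worker, transfers = simulate(root, num_workers=4)
    print("nodes per worker:", per_worker, "transfers:", transfers)
    print("sequential total:", count_sequential(root))
```

Because every node is expanded exactly once regardless of which worker holds it, the per-worker counts always sum to the sequential total; how evenly they are spread depends on the tree shape and on the donor and split policies assumed here.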

Central Archive at the University of Reading

Last time updated on 01/07/2012

This paper was published in Central Archive at the University of Reading.
