Distributed mining of molecular fragments

Di Fatta, Giuseppe; Berthold, Michael R.


oai:centaur.reading.ac.uk:6152

Abstract

In real-world applications, sequential algorithms for data mining and data exploration are often unsuitable for datasets of enormous size, high dimensionality and complex data structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context, there is a need to develop high-performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large-scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well-known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.


Last time updated on 01/07/2012

This paper was published in the Central Archive at the University of Reading.
