Informative Motif Detection Using Data Mining

Abstract

Motif finding in biological sequences is a fundamental problem in computational biology, with important applications in understanding gene regulation, identifying protein families, and determining functionally and structurally important identities. The large volume of available biological data makes it feasible to discover patterns in biological sequences computationally. In this research, we develop a data mining approach to detect frequent residue motifs that are high in information content. The proposed approach modifies an existing method based on the Apriori algorithm by replacing it with the Frequent Pattern tree (FP-tree) algorithm. The method efficiently detects novel motifs in biological sequences based on the information content of the motifs, and it performs better than the existing Apriori-based method. Experiments on real biological sequence data sets demonstrate the effectiveness of the method.
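The abstract does not state how information content is computed. As an illustration only, the sketch below assumes the widely used column-wise definition: for each aligned position, the maximum possible entropy (log2 of the alphabet size) minus the observed Shannon entropy, summed over all positions. The function name `information_content` and the parameter `alphabet_size` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def information_content(instances, alphabet_size=4):
    """Information content (in bits) of equal-length motif instances.

    Assumed definition: sum over columns of
    log2(alphabet_size) - Shannon entropy of the column.
    Use alphabet_size=4 for DNA and 20 for proteins.
    """
    if not instances or len({len(s) for s in instances}) != 1:
        raise ValueError("instances must be non-empty and of equal length")
    n = len(instances)
    total = 0.0
    # zip(*instances) yields one tuple of residues per aligned position
    for column in zip(*instances):
        entropy = -sum((c / n) * math.log2(c / n)
                       for c in Counter(column).values())
        total += math.log2(alphabet_size) - entropy
    return total
```

Under this assumed definition, a fully conserved DNA motif of length 4 scores 8 bits, while a position where all four bases are equally frequent contributes 0 bits.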
