GAMOT: AN EFFICIENT GENETIC ALGORITHM FOR FINDING CHALLENGING MOTIFS IN DNA SEQUENCES

Abstract

Weak signals that mark transcription factor binding sites involved in gene regulation are considered challenging motifs. Identifying these motifs in unaligned DNA sequences is a computationally hard problem that requires efficient algorithms. Genetic Algorithms (GAs), inspired by evolution in nature, are a class of stochastic search algorithms that have been applied successfully to many computationally hard problems, including regulatory site prediction. In this paper, we propose GAMOT, an efficient GA for solving Planted (l, d)-Motif Problems as introduced by Pevzner and Sze. We show empirically that our algorithm not only solves challenging problem instances with short motifs, such as (14,4) and (15,4), efficiently, but also solves problems with longer motifs, such as (20,7), (30,11) and (40,15). GAMOT can find the planted motifs in near-linear computational time thanks to an additional step that creates a highly fit population of solutions before the evolutionary process is applied. We compare our results with those of state-of-the-art algorithms such as VAS and PROJECTION.
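To make the Planted (l, d)-Motif Problem concrete, the following minimal sketch builds one synthetic instance and scores a candidate motif. It is an illustration under stated assumptions, not GAMOT itself: the abstract does not specify GAMOT's fitness function or its population-seeding step, so the total-distance score used here is a common choice for this problem rather than the authors' method. The function names (`plant_instance`, `total_distance`) and the instance size (20 sequences of length 600) are assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def plant_instance(t, n, l, d, rng):
    """Build t random sequences of length n, each hiding one copy of a
    random length-l motif mutated at exactly d positions.
    (Hypothetical helper; parameters are illustrative assumptions.)"""
    motif = "".join(rng.choice(ALPHABET) for _ in range(l))
    sequences = []
    for _ in range(t):
        variant = list(motif)
        # Mutate d distinct positions, each to a different nucleotide.
        for pos in rng.sample(range(l), d):
            variant[pos] = rng.choice([c for c in ALPHABET if c != variant[pos]])
        background = [rng.choice(ALPHABET) for _ in range(n)]
        start = rng.randrange(n - l + 1)
        background[start:start + l] = variant
        sequences.append("".join(background))
    return motif, sequences

def total_distance(candidate, sequences):
    """Sum, over all sequences, of the smallest Hamming distance between
    the candidate and any window of the same length. Lower is better.
    (Assumed score, not necessarily the one GAMOT uses.)"""
    l = len(candidate)
    return sum(
        min(hamming(candidate, seq[i:i + l]) for i in range(len(seq) - l + 1))
        for seq in sequences
    )

rng = random.Random(0)
motif, sequences = plant_instance(t=20, n=600, l=15, d=4, rng=rng)
score = total_distance(motif, sequences)
```

Because every sequence contains a window within distance d of the planted motif, the planted motif's score is at most t * d. A search procedure such as a GA could use a score of this kind to rank candidate motifs.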
