
A Minimum Cost Process in Searching for a Set of Similar DNA Sequences

By M Yazid M Saman, M Nordin A Rahman, Aziz Ahmad and A Osman M Tap


Abstract:- DNA sequence alignment for similarity search is a vital topic in bioinformatics algorithm development. Computationally searching large-scale DNA databases for a set of sequences, S, that are similar to a query sequence, q, is very complicated and requires high processor performance as well as large memory space. Frequently, dynamic programming algorithms with quadratic running time are used to produce an optimal local sequence alignment. However, these algorithms are time consuming when dealing with long DNA sequences. By means of local alignment, this paper presents a framework for searching for a set of similar sequences in large-scale DNA databases with reliable output and minimum cost. The Knuth-Morris-Pratt (KMP) algorithm is adapted to act as a filtering mechanism before exhaustive dynamic programming is applied. The KMP algorithm is used to scan the sequences in the databases for patterns generated from the query sequence. This filtering process generates scores, which are used for ranking. The Smith-Waterman algorithm is then applied to each sequence, starting from the top of the constructed ranking. The paper also discusses the optimal pattern length for the database scanning process. The experimental results show that the proposed filtering mechanism discards irrelevant sequences. Therefore, the time needed to search the databases and retrieve the set of sequences similar to the query is minimized.
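
The filter-then-align idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how patterns are generated or how the filtering score is computed, so the sketch assumes that patterns are the distinct overlapping windows of the query and that a sequence's score is the number of patterns it contains. The function names, the `pattern_length` and `top` parameters, and the Smith-Waterman scoring values (match 2, mismatch -1, linear gap -1) are likewise hypothetical choices made for illustration.

```python
def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_contains(text, pattern):
    """Return True if pattern occurs in text, scanning text once with KMP."""
    if not pattern:
        return True
    fail = failure_table(pattern)
    k = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
            if k == len(pattern):
                return True
    return False


def query_patterns(query, length):
    """Assumption: patterns are the distinct overlapping windows of the query."""
    return sorted({query[i:i + length] for i in range(len(query) - length + 1)})


def filter_score(sequence, patterns):
    """Assumption: the filtering score counts the patterns found in the sequence."""
    return sum(kmp_contains(sequence, p) for p in patterns)


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (quadratic time, two rows of memory).
    The scoring values are illustrative, not taken from the paper."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def search(query, database, pattern_length=4, top=2):
    """Rank sequences by the cheap KMP filter, then align only the top ones."""
    patterns = query_patterns(query, pattern_length)
    ranked = sorted(database, key=lambda s: filter_score(s, patterns), reverse=True)
    return [(s, smith_waterman(query, s)) for s in ranked[:top]]


if __name__ == "__main__":
    database = ["ACGTACGT", "TTTTTTTT", "ACGTTTTT"]
    for sequence, score in search("ACGTACGT", database):
        print(sequence, score)
```

In this sketch the quadratic Smith-Waterman step runs only on the `top` sequences of the ranking, so sequences that share no pattern with the query are never aligned. A smaller `pattern_length` makes the filter more permissive, while a larger one makes it stricter, which is the trade-off the abstract refers to when it mentions an optimal pattern length.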

Year: 2006
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX