2 research outputs found
Recommended from our members
Improved parallel and sequential walking tree methods for biological string alignments
Approximate string matching is commonly used to align genetic sequences (DNA or RNA) to determine their shared
characteristics. Most genetic string matching methods are based on the edit-distance model, which does not provide
alignments for inversions and translocations. Recently, a heuristic called the Walking Tree Method [2, 3] has been
developed to solve this problem. Unlike other heuristics, it can handle more than one level of inversion, i.e.,
inversions within inversions. Furthermore, it tends to capture the matched strings’ genes while other heuristics fail.
There are two versions of the original walking tree heuristics: the score version gives only the alignment score, the
alignment version gives both the score and the alignment mapping between the strings. The score version runs in
quadratic time and uses linear space while the alignment version uses an extra log factor for time and space.
In this paper, we will briefly describe the walking tree method and the original sequential and parallel algorithms.
We will explain why different parallel algorithms are needed for a network of workstations rather than the original
algorithm which worked well on a symmetric multi-processor. Our improved parallel method also led to a quadratic
time sequential algorithm that uses less space. We give an example of our parallel method. We describe several
experiments that show speedup linear in the number of processors, but eventual drop off in speedup as the
communication network saturates. For big enough strings, we found linear speedup for all processors we had
available. These results suggest that our improved parallel method will scale up as both the size of the problem and
the number of processors increase. We include two figures that use real biological data and show that the walking
tree methods can find translocations and inversions in DNA sequences and also discover unknown genes.Keywords: heuristic, inversions, sequence alignment, parallel algorithm, walking tree, dynamic programming, genome alignment, translocation