Linear time algorithms for finding and representing all the tandem repeats in a string

Abstract

AbstractA tandem repeat (or square) is a string αα, where α is a non-empty string. We present an O(|S|)-time algorithm that operates on the suffix tree T(S) for a string S, finding and marking the endpoint in T(S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents all occurrences of tandem repeats in S, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in S. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats

Similar works

This paper was published in Elsevier - Publisher Connector .

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.