120 research outputs found

    Linear-time Computation of Minimal Absent Words Using Suffix Array

    Get PDF
    An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their computation provides a fast alternative for measuring approximation in sequence comparison. There exists an O(n)-time and O(n)-space algorithm for computing all minimal absent words on a fixed-sized alphabet based on the construction of suffix automata (Crochemore et al., 1998). No implementation of this algorithm is publicly available. There also exists an O(n^2)-time and O(n)-space algorithm for the same problem based on the construction of suffix arrays (Pinho et al., 2009). An implementation of this algorithm was also provided by the authors and is currently the fastest available. In this article, we bridge this unpleasant gap by presenting an O(n)-time and O(n)-space algorithm for computing all minimal absent words based on the construction of suffix arrays. Experimental results using real and synthetic data show that the respective implementation outperforms the one by Pinho et al

    Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform

    Get PDF
    International audienceA degenerate or indeterminate string on an alphabet ÎŁ is a sequence of non-empty subsets of ÎŁ. Given a degenerate string t of length n, we present a new method based on the Burrows--Wheeler transform for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet ÎŁ. Furthermore, it is a hybrid pattern-matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search complexity time is O(qm2). Experimental results show that our method performs well in practice

    Editorial

    No full text
    International audienc

    Editorial

    No full text
    International audienc

    Combinatorial Algorithms

    No full text
    • 

    corecore