2 research outputs found

    Linear Algorithm for Conservative Degenerate Pattern Matching

    A degenerate symbol x* over an alphabet A is a non-empty subset of A, and a sequence of such symbols is a degenerate string. A degenerate string is said to be conservative if its number of non-solid symbols is upper-bounded by a fixed positive constant k. We consider here the matching problem of conservative degenerate strings and present the first linear-time algorithm that can find, for given degenerate strings P* and T* of total length n containing k non-solid symbols in total, the occurrences of P* in T* in O(nk) time.
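To make the definitions in this abstract concrete, here is a minimal Python sketch of degenerate strings and a naive matcher. It is not the O(nk) algorithm the abstract announces; it is a brute-force check that runs in time proportional to the product of the two lengths. The bracket notation (`a[bc]` for solid `a` followed by the non-solid symbol {b, c}), the function names, and the convention that two degenerate symbols match when their sets intersect are all assumptions made for illustration, since the abstract does not spell them out.

```python
from typing import FrozenSet, List, Sequence

# A degenerate symbol is a non-empty subset of the alphabet.
Symbol = FrozenSet[str]


def parse(s: str) -> List[Symbol]:
    """Parse a hypothetical bracket notation: 'a[bc]' -> [{a}, {b, c}]."""
    out: List[Symbol] = []
    i = 0
    while i < len(s):
        if s[i] == "[":
            j = s.index("]", i)  # raises ValueError if the bracket is unclosed
            letters = s[i + 1:j]
            if not letters:
                raise ValueError("a degenerate symbol must be non-empty")
            out.append(frozenset(letters))
            i = j + 1
        else:
            out.append(frozenset(s[i]))
            i += 1
    return out


def non_solid_count(x: Sequence[Symbol]) -> int:
    """Number of symbols that contain more than one letter."""
    return sum(1 for sym in x if len(sym) > 1)


def is_conservative(x: Sequence[Symbol], k: int) -> bool:
    """True if x has at most k non-solid symbols."""
    return non_solid_count(x) <= k


def find_occurrences(pattern: Sequence[Symbol], text: Sequence[Symbol]) -> List[int]:
    """Naive matcher: assumes two symbols match when their sets intersect."""
    m, n = len(pattern), len(text)
    return [
        i
        for i in range(n - m + 1)
        if all(pattern[j] & text[i + j] for j in range(m))
    ]


if __name__ == "__main__":
    text = parse("abac[ab]c")
    pattern = parse("a[bc]")
    print(find_occurrences(pattern, text))
    print(is_conservative(text, 1), is_conservative(pattern, 1))
```

The sketch only fixes the vocabulary (solid versus non-solid symbols, the bound k); a linear-time method would have to avoid re-comparing every alignment as this one does.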

    Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform

    A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n, we present a new method based on the Burrows–Wheeler transform for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern-matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search complexity time is O(qm^2). Experimental results show that our method performs well in practice.