1,505 research outputs found

    An Efficient Dynamic Programming Algorithm for the Generalized LCS Problem with Multiple Substring Exclusion Constrains

    In this paper, we consider a generalized longest common subsequence problem with multiple substring exclusion constraints. For two input sequences $X$ and $Y$ of lengths $n$ and $m$, and a set of $d$ constraints $P=\{P_1,...,P_d\}$ of total length $r$, the problem is to find a longest common subsequence $Z$ of $X$ and $Y$ that excludes every constraint string in $P$ as a substring. The problem was declared to be NP-hard \cite{1}, but we finally found that this is not true. A new dynamic programming solution for this problem is presented in this paper. The correctness of the new algorithm is proved. The time complexity of our algorithm is $O(nmr)$. Comment: arXiv admin note: substantial text overlap with arXiv:1301.718
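    The problem statement above can be illustrated with a minimal sketch. It is not the paper's algorithm: it swaps in a textbook combination of the LCS table with an Aho-Corasick automaton over the constraint strings, and it returns only the length of $Z$. The names `build_automaton`, `step`, and `lcs_excluding` are hypothetical, and all constraint strings are assumed to be non-empty.

```python
from collections import deque

def build_automaton(patterns):
    """Aho-Corasick trie, failure links, and a flag marking forbidden states."""
    trie, bad = [{}], [False]
    for p in patterns:  # assumption: every pattern is non-empty
        s = 0
        for c in p:
            if c not in trie[s]:
                trie.append({})
                bad.append(False)
                trie[s][c] = len(trie) - 1
            s = trie[s][c]
        bad[s] = True  # reaching s means a pattern has just been completed
    fail = [0] * len(trie)
    queue = deque(trie[0].values())  # depth-1 states fall back to the root
    while queue:
        s = queue.popleft()
        bad[s] = bad[s] or bad[fail[s]]  # a proper suffix may itself be a pattern
        for c, t in trie[s].items():
            f = fail[s]
            while f and c not in trie[f]:
                f = fail[f]
            fail[t] = trie[f].get(c, 0)
            queue.append(t)
    return trie, fail, bad

def step(trie, fail, s, c):
    """Automaton state after appending character c in state s."""
    while s and c not in trie[s]:
        s = fail[s]
    return trie[s].get(c, 0)

def lcs_excluding(x, y, patterns):
    """Length of a longest common subsequence of x and y that contains
    none of the patterns as a substring."""
    trie, fail, bad = build_automaton(patterns)
    n, m, r = len(x), len(y), len(trie)
    UNREACHABLE = -1
    # best[i][j][q]: longest valid Z from x[:i], y[:j] that leaves the automaton in q
    best = [[[UNREACHABLE] * r for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            best[i][j][0] = 0  # the empty subsequence is always available
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cell = best[i][j]
            for q in range(r):
                cell[q] = max(cell[q], best[i - 1][j][q], best[i][j - 1][q])
            if x[i - 1] == y[j - 1]:
                for q, length in enumerate(best[i - 1][j - 1]):
                    if length == UNREACHABLE:
                        continue
                    nxt = step(trie, fail, q, x[i - 1])
                    if not bad[nxt]:  # appending must not complete a pattern
                        cell[nxt] = max(cell[nxt], length + 1)
    return max(best[n][m])
```

    For example, `lcs_excluding("abab", "abab", ["ab"])` evaluates to 2, since every common subsequence of length 3 or more contains `ab`. The table has $O(nmr)$ cells, but `step` follows failure links, so this sketch does not by itself guarantee the $O(nmr)$ bound stated in the abstract.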

    Variants of Constrained Longest Common Subsequence

    In this work, we consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings, and a function Co from A to N, the DC-LCS problem consists in finding the longest subsequence s of s1 and s2 such that s is a supersequence of all the strings in C_s and such that the number of occurrences in s of each symbol a in A is upper bounded by Co(a). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we illustrate a fixed-parameter algorithm where the parameter is the length of the solution. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameters are the number of constraint strings and the size of the alphabet A. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant.
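    The DC-LCS definition can be made concrete with a small feasibility check. This is only a sketch of the constraints, not an algorithm from the paper: it tests whether a given candidate s satisfies the definition and does not search for a longest one. The names `is_subsequence` and `is_feasible` are hypothetical, Co is assumed to be given as a dictionary, and symbols missing from that dictionary are assumed to have bound 0.

```python
from collections import Counter

def is_subsequence(a, b):
    """True if a can be obtained from b by deleting characters."""
    remaining = iter(b)
    return all(ch in remaining for ch in a)

def is_feasible(s, s1, s2, constraints, max_count):
    """Check the three DC-LCS conditions for a candidate string s."""
    # 1. s must be a common subsequence of s1 and s2
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    # 2. s must be a supersequence of every constraint string
    if not all(is_subsequence(c, s) for c in constraints):
        return False
    # 3. each symbol may occur at most Co(symbol) times
    #    (assumption: a symbol absent from max_count has bound 0)
    counts = Counter(s)
    return all(counts[a] <= max_count.get(a, 0) for a in counts)
```

    With `s1 = "abcab"` and `s2 = "acbab"`, the candidate `"abab"` is feasible for the constraint set `["ab"]` and bounds `{"a": 2, "b": 2, "c": 1}`, but becomes infeasible once the bound on `a` is lowered to 1.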

    The substring inclusion constraint longest common subsequence problem can be solved in quadratic time

    In this paper, we study some variants of the Constrained Longest Common Subsequence (CLCS) problem, namely, the substring inclusion CLCS (Substring-IC-CLCS) problem and a generalized version thereof. In the Substring-IC-CLCS problem, we are to find a longest common subsequence (LCS) of two given strings that contains a given third constraint string as a substring. The previous solution to this problem runs in cubic time, i.e., O(nmk) time, where n, m and k are the lengths of the three input strings. In this paper, we present simple O(nm) time algorithms to solve the Substring-IC-CLCS problem. We also study the Generalized Substring-IC-LCS problem, where we are given two strings of length n and m respectively and an ordered list of p strings, and the goal is to find an LCS containing each of them as a substring in the order they appear in the list. We present an O(nmp) algorithm for this generalized version of the problem.
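    A minimal sketch of the Substring-IC-CLCS problem follows. It is not claimed to be the paper's algorithm: it uses a standard decomposition into a forward LCS table, a shortest greedy embedding of the constraint string, and a backward LCS table, and it returns only the length. The function name `lcs_containing_substring` is hypothetical, and the constraint string is assumed to be non-empty.

```python
def lcs_containing_substring(x, y, p):
    """Length of a longest common subsequence of x and y that contains p
    as a substring, or None if no such subsequence exists."""
    n, m, k = len(x), len(y), len(p)
    # fwd[i][j] = LCS length of x[:i] and y[:j]
    fwd = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                fwd[i][j] = fwd[i - 1][j - 1] + 1
            else:
                fwd[i][j] = max(fwd[i - 1][j], fwd[i][j - 1])
    # bwd[i][j] = LCS length of x[i:] and y[j:]
    bwd = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if x[i] == y[j]:
                bwd[i][j] = bwd[i + 1][j + 1] + 1
            else:
                bwd[i][j] = max(bwd[i + 1][j], bwd[i][j + 1])

    def greedy_ends(s):
        """Map each start a with s[a] == p[0] to the index just past the
        shortest embedding of p in s that begins at a."""
        ends = {}
        for a in range(len(s)):
            if s[a] != p[0]:
                continue
            pos, matched = a, 0
            while pos < len(s) and matched < k:
                if s[pos] == p[matched]:
                    matched += 1
                pos += 1
            if matched == k:
                ends[a] = pos
        return ends

    best = None
    for a, end_x in greedy_ends(x).items():
        for c, end_y in greedy_ends(y).items():
            # prefix part + the constraint itself + suffix part
            total = fwd[a][c] + k + bwd[end_x][end_y]
            if best is None or total > best:
                best = total
    return best
```

    For example, `lcs_containing_substring("abcab", "acbab", "cb")` evaluates to 3 (the string `acb`). The two LCS tables cost O(nm), but the naive greedy scan per start position adds O(n^2 + m^2) work, so this sketch does not reproduce the O(nm) bound reported in the abstract.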

    Subsequence Automata with Default Transitions

    Let $S$ be a string of length $n$ with characters from an alphabet of size $\sigma$. The \emph{subsequence automaton} of $S$ (often called the \emph{directed acyclic subsequence graph}) is the minimal deterministic finite automaton accepting all subsequences of $S$. A straightforward construction shows that the size (number of states and transitions) of the subsequence automaton is $O(n\sigma)$ and that this bound is asymptotically optimal. In this paper, we consider subsequence automata with \emph{default transitions}, that is, special transitions to be taken only if none of the regular transitions match the current character, and which do not consume the current character. We show that with default transitions, much smaller subsequence automata are possible, and provide a full trade-off between the size of the automaton and the \emph{delay}, i.e., the maximum number of consecutive default transitions followed before consuming a character. Specifically, given any integer parameter $k$, $1 < k \leq \sigma$, we present a subsequence automaton with default transitions of size $O(nk\log_{k}\sigma)$ and delay $O(\log_k \sigma)$. Hence, with $k = 2$ we obtain an automaton of size $O(n \log \sigma)$ and delay $O(\log \sigma)$. On the other extreme, with $k = \sigma$, we obtain an automaton of size $O(n \sigma)$ and delay $O(1)$, thus matching the bound for the standard subsequence automaton construction. Finally, we generalize the result to multiple strings. The key component of our result is a novel hierarchical automata construction of independent interest. Comment: Corrected typo
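    The straightforward $O(n\sigma)$ construction mentioned above can be sketched as a next-occurrence table. This covers only the standard automaton, not the default-transition construction of the paper, and the table is not necessarily minimized. The names `build_subsequence_automaton` and `accepts` are hypothetical.

```python
def build_subsequence_automaton(s):
    """Transition table with states 0..len(s); state i means the first i
    characters of s have been passed. delta[i][c] is the state reached by
    consuming the first occurrence of c at or after position i."""
    n = len(s)
    delta = [dict() for _ in range(n + 1)]  # the last state has no transitions
    for i in range(n - 1, -1, -1):
        delta[i] = dict(delta[i + 1])  # inherit the later occurrences
        delta[i][s[i]] = i + 1         # the nearest occurrence of s[i] is at i
    return delta

def accepts(delta, w):
    """True if w is a subsequence of the string the table was built from."""
    state = 0
    for c in w:
        if c not in delta[state]:
            return False
        state = delta[state][c]
    return True
```

    For `delta = build_subsequence_automaton("abcab")`, the query `accepts(delta, "acb")` succeeds and `accepts(delta, "cc")` fails. Each of the $n + 1$ states stores at most $\sigma$ transitions, which matches the $O(n\sigma)$ size bound.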

    Algebraic aspects of increasing subsequences

    We present a number of results relating partial Cauchy-Littlewood sums, integrals over the compact classical groups, and increasing subsequences of permutations. These include: integral formulae for the distribution of the longest increasing subsequence of a random involution with constrained number of fixed points; new formulae for partial Cauchy-Littlewood sums, as well as new proofs of old formulae; relations of these expressions to orthogonal polynomials on the unit circle; and explicit bases for invariant spaces of the classical groups, together with appropriate generalizations of the straightening algorithm. Comment: LaTeX+amsmath+eepic; 52 pages. Expanded introduction, new references, other minor change