4,203 research outputs found

    Sublinear Space Algorithms for the Longest Common Substring Problem

    Given $m$ documents of total length $n$, we consider the problem of finding a longest string common to at least $d \geq 2$ of the documents. This problem is known as the longest common substring (LCS) problem and has a classic $O(n)$ space and $O(n)$ time solution (Weiner [FOCS'73], Hui [CPM'92]). However, the use of linear space is impractical in many applications. In this paper we show that for any trade-off parameter $1 \leq \tau \leq n$, the LCS problem can be solved in $O(\tau)$ space and $O(n^2/\tau)$ time, thus providing the first smooth deterministic time-space trade-off from constant to linear space. The result uses a new and very simple algorithm, which computes a $\tau$-additive approximation to the LCS in $O(n^2/\tau)$ time and $O(1)$ space. We also show a time-space trade-off lower bound for deterministic branching programs, which implies that any deterministic RAM algorithm solving the LCS problem on documents from a sufficiently large alphabet in $O(\tau)$ space must use $\Omega(n\sqrt{\log(n/(\tau\log n))/\log\log(n/(\tau\log n))})$ time.
    Comment: Accepted to 22nd European Symposium on Algorithms
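To make the constant-space end of the trade-off concrete, here is a minimal sketch for the special case of two strings. It is a textbook diagonal scan, not the algorithm of the paper above, and the function name `longest_common_substring` is chosen only for illustration. It uses quadratic time and a constant number of integer variables beyond the inputs.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Baseline sketch: O(len(a) * len(b)) time, O(1) extra space."""
    n, m = len(a), len(b)
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside a
    # Every aligned pair (i, j) lies on exactly one diagonal shift = i - j.
    for shift in range(-(m - 1), n):
        i = max(shift, 0)
        j = i - shift
        run = 0  # length of the current run of equal characters
        while i < n and j < m:
            if a[i] == b[j]:
                run += 1
                if run > best_len:
                    best_len = run
                    best_end = i + 1
            else:
                run = 0
            i += 1
            j += 1
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(longest_common_substring("xabcdy", "zzabcdq"))
```

The general problem in the abstract (m documents, a substring shared by at least d of them) needs more machinery than this two-string baseline.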

    An Algorithm for the Longest Common Subsequence and Substring Problem

    In this note, we first introduce a new problem called the longest common subsequence and substring problem. Let $X$ and $Y$ be two strings over an alphabet $\Sigma$. The longest common subsequence and substring problem for $X$ and $Y$ is to find the longest string which is a subsequence of $X$ and a substring of $Y$. We propose an algorithm to solve the problem.
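The problem definition above can be illustrated with a short dynamic program. This is a sketch under stated assumptions and not necessarily the algorithm proposed in the note; the function name is hypothetical. Let L[i][j] be the length of the longest string that is a subsequence of X[:i] and a suffix of Y[:j]. Then L[i][j] = L[i-1][j-1] + 1 when X[i-1] = Y[j-1], and L[i][j] = L[i-1][j] otherwise.

```python
def longest_subsequence_substring(x: str, y: str) -> str:
    """Longest string that is a subsequence of x and a substring of y.

    Sketch only: O(len(x) * len(y)) time, O(len(y)) space (two DP rows).
    """
    n, m = len(x), len(y)
    prev = [0] * (m + 1)  # row i-1 of the table L
    for i in range(1, n + 1):
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend a suffix of y[:j-1]
            else:
                cur[j] = prev[j]          # x[i-1] cannot be the last character
        prev = cur
    # L[i][j] never decreases as i grows, so the last row holds the optimum.
    best_j = max(range(m + 1), key=lambda j: prev[j])
    return y[best_j - prev[best_j]:best_j]


if __name__ == "__main__":
    print(longest_subsequence_substring("axbxcxd", "zabcdz"))
```

Because the answer is a substring of Y ending at position `best_j`, it can be read off directly from Y without backtracking through the table.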

    The substring inclusion constraint longest common subsequence problem can be solved in quadratic time

    In this paper, we study some variants of the Constrained Longest Common Subsequence (CLCS) problem, namely, the substring inclusion CLCS (Substring-IC-CLCS) problem and a generalized version thereof. In the Substring-IC-CLCS problem, we are to find a longest common subsequence (LCS) of two given strings that contains a third, given constraint string as a substring. The previous solution to this problem runs in cubic time, i.e., O(nmk) time, where n, m, and k are the lengths of the three input strings. In this paper, we present simple O(nm) time algorithms to solve the Substring-IC-CLCS problem. We also study the Generalized Substring-IC-LCS problem, where we are given two strings of length n and m respectively and an ordered list of p strings, and the goal is to find an LCS containing each of them as a substring in the order they appear in the list. We present an O(nmp) algorithm for this generalized version of the problem.

    An Algorithm for the Constrained Longest Common Subsequence and Substring Problem

    Let $\Sigma$ be an alphabet. For two strings $X$, $Y$, and a constrained string $P$ over the alphabet $\Sigma$, the constrained longest common subsequence and substring problem for two strings $X$ and $Y$ with respect to $P$ is to find a longest string $Z$ which is a subsequence of $X$, a substring of $Y$, and has $P$ as a subsequence. In this paper, we propose an algorithm for the constrained longest common subsequence and substring problem for two strings with a constrained string.
    Comment: arXiv admin note: text overlap with arXiv:2308.0092

    Longest common substring with approximately k mismatches

    In the longest common substring problem we are given two strings of length n and must find a substring of maximal length that occurs in both strings. It is well-known that the problem can be solved in linear time, but the solution is not robust and can vary greatly when the input strings are changed even by one letter. To circumvent this, Leimeister and Morgenstern introduced the problem of the longest common substring with k mismatches. Lately, this problem has received a lot of attention in the literature, and several algorithms have been suggested. The running time of these algorithms is n^{2-o(1)}, and unfortunately, conditional lower bounds have been shown which imply that there is little hope to improve this bound. In this paper we study a different but closely related problem of the longest common substring with approximately k mismatches and use computational geometry techniques to show that it admits a randomised solution with strongly subquadratic running time.
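For reference, the exact k-mismatches problem mentioned above has a simple quadratic baseline: slide a window along every diagonal of the alignment and keep at most k mismatches inside it. The sketch below shows that baseline only. It is not the randomised subquadratic method of the paper, and the function name and return convention are assumptions made for illustration.

```python
def lcs_k_mismatches(s: str, t: str, k: int) -> tuple:
    """Return (length, start_in_s, start_in_t) of a longest pair of
    equal-length substrings of s and t with Hamming distance at most k.

    Baseline sketch: O(len(s) * len(t)) time, O(1) extra space, k >= 0.
    """
    n, m = len(s), len(t)
    best = (0, 0, 0)
    for shift in range(-(m - 1), n):       # one diagonal per shift = i - j
        i0 = max(shift, 0)
        j0 = i0 - shift
        length = min(n - i0, m - j0)       # number of aligned positions
        left = 0
        mismatches = 0
        for right in range(length):
            if s[i0 + right] != t[j0 + right]:
                mismatches += 1
            # Shrink from the left until the window is valid again.
            while mismatches > k:
                if s[i0 + left] != t[j0 + left]:
                    mismatches -= 1
                left += 1
            window = right - left + 1
            if window > best[0]:
                best = (window, i0 + left, j0 + left)
    return best


if __name__ == "__main__":
    print(lcs_k_mismatches("abcdef", "abXdef", 1))
```

With k = 0 this reduces to the ordinary longest common substring of two strings.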

    Longest common substrings with k mismatches

    The longest common substring with k-mismatches problem is to find, given two strings S_1 and S_2, a longest substring A_1 of S_1 and A_2 of S_2 such that the Hamming distance between A_1 and A_2 is
    Peer reviewed