    Two algorithms for LCS Consecutive Suffix Alignment

    The problem of aligning two sequences A and B to determine their similarity is one of the fundamental problems in pattern matching. A challenging, basic variation of the sequence similarity problem is the incremental string comparison problem, denoted Consecutive Suffix Alignment, which is, given two strings A and B, to compute the alignment solution of each suffix of A versus B.

    Here, we present two solutions to the Consecutive Suffix Alignment problem under the LCS (Longest Common Subsequence) metric, where the LCS metric measures the subsequence of maximal length common to A and B. The first solution is an O(nL) time and space algorithm for constant alphabets, where the size of the compared strings is O(n) and L ⩽ n denotes the size of the LCS of A and B.

    The second solution is an O(nL + n log|Σ|) time and O(n) space algorithm for general alphabets, where Σ denotes the alphabet of the compared strings.
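
    To make the problem statement concrete, the sketch below (Python; function name and all code are illustrative) is the plain quadratic baseline, not either of the paper's O(nL) or O(nL + n log|Σ|) algorithms: a single suffix-versus-suffix DP table dp[i][j] = LCS(A[i:], B[j:]) yields the answer for every suffix of A at once, since the score for suffix A[i:] against all of B is dp[i][0].

    def consecutive_suffix_lcs(A, B):
        """LCS length of every suffix of A versus the whole of B.

        Quadratic baseline sketch: fill dp[i][j] = LCS(A[i:], B[j:])
        from the back, then read off column j = 0.
        """
        n, m = len(A), len(B)
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n - 1, -1, -1):
            for j in range(m - 1, -1, -1):
                if A[i] == B[j]:
                    dp[i][j] = dp[i + 1][j + 1] + 1
                else:
                    dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
        # Entry i is the alignment score for suffix A[i:]; the final
        # entry (the empty suffix) is always 0.
        return [dp[i][0] for i in range(n + 1)]

    # Example: consecutive_suffix_lcs("banana", "atana") -> [4, 4, 3, 3, 2, 1, 0]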

    Multivariate Fine-Grained Complexity of Longest Common Subsequence

    We revisit the classic combinatorial pattern matching problem of finding a longest common subsequence (LCS). For strings x and y of length n, a textbook algorithm solves LCS in time O(n^2), but although much effort has been spent, no O(n^{2-ε})-time algorithm is known. Recent work indeed shows that such an algorithm would refute the Strong Exponential Time Hypothesis (SETH) [Abboud, Backurs, Vassilevska Williams + Bringmann, Künnemann FOCS'15]. Despite the quadratic-time barrier, for over 40 years an enduring scientific interest continued to produce fast algorithms for LCS and its variations. Particular attention was put into identifying and exploiting input parameters that yield strongly subquadratic time algorithms for special cases of interest, e.g., differential file comparison. This line of research was successfully pursued until 1990, at which time significant improvements came to a halt. In this paper, using the lens of fine-grained complexity, our goal is to (1) justify the lack of further improvements and (2) determine whether some special cases of LCS admit faster algorithms than currently known. To this end, we provide a systematic study of the multivariate complexity of LCS, taking into account all parameters previously discussed in the literature: the input size n := max{|x|, |y|}, the length of the shorter string m := min{|x|, |y|}, the length L of an LCS of x and y, the numbers of deletions δ := m − L and Δ := n − L, the alphabet size, as well as the numbers of matching pairs M and dominant pairs d. For any class of instances defined by fixing each parameter individually to a polynomial in terms of the input size, we prove a SETH-based lower bound matching one of three known algorithms. Specifically, we determine the optimal running time for LCS under SETH as (n + min{d, δΔ, δm})^{1 ± o(1)}. [...]

    Comment: Presented at SODA'18. Full Version. 66 pages.
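
    To make this parameter landscape concrete, here is a small reference sketch (Python, quadratic and illustrative only; it is none of the three algorithms whose running times the lower bounds match) that computes n, m, L, δ, Δ, M and d for a given pair of strings. The dominant-pair count uses the standard characterization that a matching pair (i, j) is dominant exactly when its DP value strictly exceeds both the cell above it and the cell to its left.

    def lcs_parameters(x, y):
        """Compute the LCS parameters n, m, L, delta, Delta, M, d.

        Plain O(nm) dynamic program over prefixes; dp[i][j] is the
        LCS length of x[:i] and y[:j].
        """
        if len(x) < len(y):
            x, y = y, x              # ensure |x| >= |y|, so n = |x|, m = |y|
        n, m = len(x), len(y)
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        M = d = 0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if x[i - 1] == y[j - 1]:
                    M += 1           # matching pair
                    dp[i][j] = dp[i - 1][j - 1] + 1
                    # dominant pair: the match strictly improves on both
                    # the cell above and the cell to the left
                    if dp[i][j] > dp[i - 1][j] and dp[i][j] > dp[i][j - 1]:
                        d += 1
                else:
                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
        L = dp[n][m]
        return {"n": n, "m": m, "L": L,
                "delta": m - L, "Delta": n - L, "M": M, "d": d}

    # Example (the classic instance x = "abcabba", y = "cbabac"):
    # lcs_parameters("abcabba", "cbabac")
    # -> {'n': 7, 'm': 6, 'L': 4, 'delta': 2, 'Delta': 3, 'M': 14, 'd': 11}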
