2,646 research outputs found
An Algorithm for the Constrained Longest Common Subsequence and Substring Problem
Given an alphabet, two strings, and a constraint string over that alphabet,
the constrained longest common subsequence and substring problem is to find a
longest string that is a subsequence of the first string, a substring of the
second string, and has the constraint string as a subsequence. In this paper,
we propose an algorithm for this problem.
Comment: arXiv admin note: text overlap with arXiv:2308.0092
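To make the problem statement above concrete, here is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper; it only enumerates substrings of the second string and checks the two subsequence conditions. The function names and the argument order (x, y, p) are assumptions made for illustration.

```python
def is_subsequence(small, big):
    """Return True if `small` can be obtained from `big` by deleting characters."""
    it = iter(big)
    # `ch in it` advances the iterator, so matches must occur in order.
    return all(ch in it for ch in small)


def clcs_substring_bruteforce(x, y, p):
    """Longest z that is a subsequence of x, a substring of y,
    and has p as a subsequence. Returns None if no such z exists.

    Reference implementation only: it tries every substring of y.
    """
    best = None
    for i in range(len(y) + 1):
        for j in range(i, len(y) + 1):
            z = y[i:j]
            if best is not None and len(z) <= len(best):
                continue  # cannot improve on the current best
            if is_subsequence(z, x) and is_subsequence(p, z):
                best = z
    return best
```

For example, with x = "abcbdab", y = "bdcaba" and p = "b", the sketch returns "cab", which is a substring of y, a subsequence of x, and contains "b".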
Variants of Constrained Longest Common Subsequence
In this work, we consider a variant of the classical Longest Common
Subsequence problem called Doubly-Constrained Longest Common Subsequence
(DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings,
and a function Co from A to N, the DC-LCS problem consists in finding the
longest subsequence s of s1 and s2 such that s is a supersequence of all the
strings in C_s and such that the number of occurrences in s of each symbol a in
A is upper bounded by Co(a). The DC-LCS problem provides a clear mathematical
formulation of a sequence comparison problem in Computational Biology and
generalizes two other constrained variants of the LCS problem: the Constrained
LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem.
First, we illustrate a fixed-parameter algorithm where the parameter is the
length of the solution. Secondly, we prove a parameterized hardness result for
the Constrained LCS problem when the parameter is the number of the constraint
strings and the size of the alphabet A. This hardness result also implies the
parameterized hardness of the DC-LCS problem (with the same parameters) and its
NP-hardness when the size of the alphabet is constant.
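The DC-LCS definition above combines three conditions on a candidate string. The following Python sketch checks whether a given candidate is feasible; it does not search for an optimal solution and is not the fixed-parameter algorithm from the paper. The function names are hypothetical, and representing Co as a dictionary (with unlisted symbols treated as having bound 0) is an assumption.

```python
from collections import Counter


def is_subsequence(small, big):
    """Return True if `small` is a subsequence of `big`."""
    it = iter(big)
    return all(ch in it for ch in small)


def is_feasible_dc_lcs(s, s1, s2, constraints, max_occ):
    """Check the DC-LCS feasibility conditions for a candidate string s.

    constraints: iterable of strings, each of which must be a subsequence of s.
    max_occ: dict mapping a symbol to its occurrence bound (assumed encoding
             of Co); symbols missing from the dict are treated as bound 0.
    """
    # s must be a common subsequence of s1 and s2.
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    # s must be a supersequence of every constraint string.
    if not all(is_subsequence(c, s) for c in constraints):
        return False
    # Each symbol may occur at most max_occ[symbol] times in s.
    counts = Counter(s)
    return all(cnt <= max_occ.get(ch, 0) for ch, cnt in counts.items())
```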
Longest Common Subsequence with Gap Constraints
We consider the longest common subsequence problem in the context of
subsequences with gap constraints. In particular, following Day et al. 2022, we
consider the setting when the distance (i.e., the gap) between two consecutive
symbols of the subsequence has to be between a lower and an upper bound (which
may depend on the position of those symbols in the subsequence or on the
symbols bordering the gap) as well as the case where the entire subsequence is
found in a bounded range (defined by a single upper bound), considered by
Kosche et al. 2022. In all these cases, we present efficient algorithms for
determining the length of the longest common constrained subsequence between
two given strings.
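As an illustration of the simplest setting described above, the sketch below computes the length of a longest common subsequence when every gap must lie between one fixed lower bound and one fixed upper bound in both strings. It is a naive dynamic program, not one of the efficient algorithms from the paper. It assumes that the gap between embedding positions i' < i is i - i' - 1 (the number of skipped symbols); the function name and the uniform bounds lo and hi are also assumptions.

```python
def gap_lcs_length(a, b, lo, hi):
    """Length of a longest common subsequence of a and b in which every
    pair of consecutive matched positions, in both strings, is separated
    by at least `lo` and at most `hi` skipped symbols.
    """
    n, m = len(a), len(b)
    # end[i][j] = best length of a valid subsequence whose last symbol
    # is matched at a[i] and b[j]; 0 means a[i] != b[j].
    end = [[0] * m for _ in range(n)]
    best = 0
    for i in range(n):
        for j in range(m):
            if a[i] != b[j]:
                continue
            cur = 1  # a single matched symbol has no gaps to satisfy
            # Previous positions pi with lo <= i - pi - 1 <= hi.
            for pi in range(max(0, i - 1 - hi), i - lo):
                for pj in range(max(0, j - 1 - hi), j - lo):
                    if end[pi][pj]:
                        cur = max(cur, end[pi][pj] + 1)
            end[i][j] = cur
            best = max(best, cur)
    return best
```

With lo = 0 and hi at least as large as both string lengths, this reduces to the classical LCS length; with lo = hi = 0 it gives the length of a longest common substring.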
Constrained Longest Common Subsequence Computing Algorithms in Practice
The problem of finding a constrained longest common subsequence (CLCS) for the sequences A and B with respect to sequence P was introduced recently. Its goal is to find a longest subsequence C of A and B such that P is a subsequence of C. There are several algorithms solving the CLCS problem, but there is no real experimental comparison of them. The paper has two aims. Firstly, we propose an improvement to the algorithms by Chin et al. and Deorowicz based on an entry-exit points technique by He and Arslan. Secondly, we experimentally compare the existing algorithms for solving the CLCS problem.
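The CLCS definition above can be sketched as a straightforward three-dimensional dynamic program over prefixes of A, B and P, running in time proportional to the product of the three lengths. This is a textbook-style formulation written for illustration; it is not the improved algorithm proposed in the paper, and the function name is hypothetical.

```python
def clcs_length(a, b, p):
    """Length of a longest common subsequence of a and b that has p as a
    subsequence, or None if no such common subsequence exists.
    """
    NEG = float("-inf")  # marks infeasible states
    n, m, r = len(a), len(b), len(p)
    # dp[i][j][k]: best length using a[:i], b[:j] with p[:k] embedded.
    dp = [[[NEG] * (r + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            for k in range(r + 1):
                if i == 0 or j == 0:
                    # Empty prefix: only the empty constraint is satisfiable.
                    dp[i][j][k] = 0 if k == 0 else NEG
                    continue
                # Skip the last symbol of a or of b.
                cur = max(dp[i - 1][j][k], dp[i][j - 1][k])
                if a[i - 1] == b[j - 1]:
                    # Use the match without advancing in p.
                    cur = max(cur, dp[i - 1][j - 1][k] + 1)
                    # Use the match to cover p[k-1] as well.
                    if k > 0 and a[i - 1] == p[k - 1]:
                        cur = max(cur, dp[i - 1][j - 1][k - 1] + 1)
                dp[i][j][k] = cur
    result = dp[n][m][r]
    return None if result == NEG else result
```

With an empty P the function returns the ordinary LCS length, which gives a simple sanity check.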
The substring inclusion constraint longest common subsequence problem can be solved in quadratic time
In this paper, we study some variants of the Constrained Longest Common Subsequence (CLCS) problem, namely, the substring inclusion CLCS (Substring-IC-CLCS) problem and a generalized version thereof. In the Substring-IC-CLCS problem, we are to find a longest common subsequence (LCS) of two given strings that contains a given third constraint string as a substring. The previous solution to this problem runs in cubic time, i.e., O(nmk) time, where n, m and k are the lengths of the three input strings. In this paper, we present simple O(nm) time algorithms to solve the Substring-IC-CLCS problem. We also study the Generalized Substring-IC-LCS problem, where we are given two strings of lengths n and m, respectively, and an ordered list of p strings, and the goal is to find an LCS containing each of them as a substring in the order they appear in the list. We present an O(nmp) algorithm for this generalized version of the problem.
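One way to approach the Substring-IC-CLCS problem described above is to combine a forward LCS table, a backward LCS table, and the shortest occurrence of the constraint string as a subsequence starting at each position. The Python sketch below follows that idea. It is an illustrative simplification, not the paper's algorithm: the end-point scan is naive, so the sketch does not achieve the O(nm) bound. All function names are hypothetical.

```python
def lcs_table(a, b):
    """t[i][j] = LCS length of a[:i] and b[:j]."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t


def shortest_match_ends(s, c):
    """ends[i] = smallest e such that c is a subsequence of s[i:e] and
    s[i] == c[0]; None if no such e exists. Requires non-empty c."""
    ends = [None] * len(s)
    for i in range(len(s)):
        if s[i] != c[0]:
            continue
        k = 0
        for e in range(i, len(s)):
            if s[e] == c[k]:  # greedy left-to-right matching
                k += 1
                if k == len(c):
                    ends[i] = e + 1
                    break
    return ends


def str_ic_lcs_length(a, b, c):
    """Length of a longest common subsequence of a and b that contains c
    as a substring, or None if none exists."""
    n, m = len(a), len(b)
    fwd = lcs_table(a, b)
    if not c:
        return fwd[n][m]
    # rev[x][y] = LCS length of the last x symbols of a and last y of b.
    rev = lcs_table(a[::-1], b[::-1])
    ends_a = shortest_match_ends(a, c)
    ends_b = shortest_match_ends(b, c)
    best = None
    for i, ea in enumerate(ends_a):
        if ea is None:
            continue
        for j, eb in enumerate(ends_b):
            if eb is None:
                continue
            # prefix LCS + embedded constraint + suffix LCS
            total = fwd[i][j] + len(c) + rev[n - ea][m - eb]
            if best is None or total > best:
                best = total
    return best
```

Using the shortest occurrence from each start position is safe because no other symbols of the solution may fall between the symbols of the constraint string, so a shorter occurrence only leaves more room for the suffix.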
Faster STR-IC-LCS Computation via RLE
The constrained LCS problem asks one to find a longest common subsequence of two input strings A and B with some constraints. The STR-IC-LCS problem is a variant of the constrained LCS problem, where the solution must include a given constraint string C as a substring. Given two strings A and B of respective lengths M and N, and a constraint string C of length at most min{M, N}, the best known algorithm for the STR-IC-LCS problem, proposed by Deorowicz (Inf. Process. Lett., 11:423-426, 2012), runs in O(MN) time. In this work, we present an O(mN + nM)-time solution to the STR-IC-LCS problem, where m and n denote the sizes of the run-length encodings of A and B, respectively. Since m <= M and n <= N always hold, our algorithm is always as fast as Deorowicz's algorithm, and is faster when input strings are compressible via RLE.
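The parameters m and n in the abstract above are the numbers of runs in the run-length encodings of A and B. The short Python sketch below shows only how such an encoding and its size can be computed; it does not implement the STR-IC-LCS algorithm. The function name is hypothetical.

```python
from itertools import groupby


def run_length_encode(s):
    """Return the run-length encoding of s as a list of (symbol, count) pairs."""
    return [(ch, sum(1 for _ in grp)) for ch, grp in groupby(s)]
```

For a string such as "aaabccdd" of length M = 8, the encoding has m = 4 runs, so the RLE size is never larger than the string length and is much smaller for highly repetitive inputs.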
An Efficient Dynamic Programming Algorithm for the Generalized LCS Problem with Multiple Substring Exclusion Constrains
In this paper, we consider a generalized longest common subsequence problem
with multiple substring exclusion constraints. Given two input sequences and a
set of constraint strings, the problem is to find a longest common subsequence
of the two input sequences that excludes each constraint string as a
substring. The problem was previously declared to be NP-hard, but we found
that this is not true. A new dynamic programming solution for this problem is
presented in this paper. The correctness of the new algorithm is proved, and
its time complexity is also given.
Comment: arXiv admin note: substantial text overlap with arXiv:1301.718