Variants of Constrained Longest Common Subsequence
In this work, we consider a variant of the classical Longest Common
Subsequence problem called Doubly-Constrained Longest Common Subsequence
(DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings,
and a function Co from A to N, the DC-LCS problem consists in finding the
longest subsequence s of s1 and s2 such that s is a supersequence of all the
strings in C_s and such that the number of occurrences in s of each symbol a in
A is upper bounded by Co(a). The DC-LCS problem provides a clear mathematical
formulation of a sequence comparison problem in Computational Biology and
generalizes two other constrained variants of the LCS problem: the Constrained
LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem.
First, we illustrate a fixed-parameter algorithm where the parameter is the
length of the solution. Second, we prove a parameterized hardness result for
the Constrained LCS problem when the parameter is the number of the constraint
strings and the size of the alphabet A. This hardness result also implies the
parameterized hardness of the DC-LCS problem (with the same parameters) and its
NP-hardness when the size of the alphabet is constant.
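The three DC-LCS conditions stated above can be made concrete with a small feasibility check. The following Python sketch is illustrative only and is not taken from the paper: the function names are hypothetical, and treating a symbol that is missing from the bound map as having bound 0 is an assumption.

```python
from collections import Counter


def is_subsequence(small, big):
    """Return True if `small` can be obtained from `big` by deleting characters."""
    remaining = iter(big)
    # `ch in remaining` consumes the iterator, so matches must occur in order.
    return all(ch in remaining for ch in small)


def is_feasible_dc_lcs(s, s1, s2, constraint_strings, max_occurrences):
    """Check whether `s` satisfies the DC-LCS conditions (feasibility, not optimality).

    `constraint_strings` plays the role of C_s and `max_occurrences` the role of Co.
    Assumption: a symbol absent from `max_occurrences` has bound 0.
    """
    # 1. s must be a common subsequence of s1 and s2.
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    # 2. s must be a supersequence of every constraint string.
    if not all(is_subsequence(c, s) for c in constraint_strings):
        return False
    # 3. Each symbol may occur at most Co(a) times in s.
    counts = Counter(s)
    return all(counts[a] <= max_occurrences.get(a, 0) for a in counts)
```

A DC-LCS solution is then a longest string that passes this check; finding one is the hard part that the abstract addresses.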
An Efficient Dynamic Programming Algorithm for the Generalized LCS Problem with Multiple Substring Exclusion Constrains
In this paper, we consider a generalized longest common subsequence problem
with multiple substring exclusion constraints. For two input sequences and a
set of constraint strings, the problem is to find a common subsequence of the
two sequences that excludes every constraint string as a substring and whose
length is maximized. The problem was declared to be NP-hard [1], but we
found that this is not true. A new dynamic programming solution for
this problem is presented in this paper. The correctness of the new algorithm
is proved. The time complexity of our algorithm is given in terms of the two
sequence lengths and the total length of the constraint strings.
Comment: arXiv admin note: substantial text overlap with arXiv:1301.718
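To illustrate the problem statement, here is a minimal memoized sketch in Python. It is not the paper's algorithm: instead of an efficient matching automaton it simply remembers the last few characters of the subsequence built so far, which is correct but only practical for small inputs. The function name and the decision to ignore empty constraint strings are assumptions.

```python
from functools import lru_cache


def lcs_excluding(x, y, constraints):
    """Length of a longest common subsequence of x and y that contains
    none of the constraint strings as a substring (small inputs only)."""
    # Assumption: empty constraint strings are ignored.
    constraints = [p for p in constraints if p]
    # Remembering the last (longest constraint length - 1) characters is enough
    # to detect any constraint that would end at the next appended character.
    keep = max((len(p) for p in constraints), default=1) - 1

    @lru_cache(maxsize=None)
    def best(i, j, tail):
        if i == len(x) or j == len(y):
            return 0
        # Option 1: skip a character of x or of y.
        result = max(best(i + 1, j, tail), best(i, j + 1, tail))
        # Option 2: append the matching character, unless it completes a constraint.
        if x[i] == y[j]:
            grown = tail + x[i]
            if not any(grown.endswith(p) for p in constraints):
                new_tail = grown[-keep:] if keep else ""
                result = max(result, 1 + best(i + 1, j + 1, new_tail))
        return result

    return best(0, 0, "")
```

With an empty constraint set the function reduces to the ordinary LCS length, which gives a simple sanity check.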
Heuristic algorithms for the Longest Filled Common Subsequence Problem
At CPM 2017, Castelli et al. define and study a new variant of the Longest
Common Subsequence Problem, termed the Longest Filled Common Subsequence
Problem (LFCS). For the LFCS problem, the input consists of two strings and a
multiset of characters. The goal is to insert the characters from the multiset
into one of the two strings, thus obtaining a new string, such that the Longest
Common Subsequence (LCS) between the other string and the new string is
maximized. Castelli et al. show that the problem is NP-hard and provide a
3/5-approximation algorithm for the problem.
In this paper we study the problem from the experimental point of view. We
introduce, implement and test new heuristic algorithms and compare them with
the approximation algorithm of Castelli et al. Moreover, we introduce an Integer
Linear Program (ILP) model for the problem and we use the state-of-the-art ILP
solver, Gurobi, to obtain exact solutions for moderately sized instances.
Comment: Accepted and presented as a proceedings paper at SYNASC 201
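The LFCS objective can be illustrated with an exhaustive search over tiny instances. This Python sketch is neither one of the paper's heuristics nor its ILP model; the names `a`, `b`, and `extra` for the two strings and the multiset are hypothetical, and the sketch assumes every character of the multiset is inserted (inserting more characters cannot shorten the LCS).

```python
from collections import Counter


def lcs_length(a, b):
    """Standard row-by-row dynamic program for the LCS length."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def fillings(b, extra):
    """Yield every string obtained by inserting all characters of `extra` into `b`.
    The same string may be yielded more than once; that is harmless for a maximum."""
    remaining = Counter(extra)

    def extend(prefix, pos):
        if pos == len(b) and sum(remaining.values()) == 0:
            yield prefix
            return
        if pos < len(b):
            # Keep the next original character of b.
            yield from extend(prefix + b[pos], pos + 1)
        for ch in list(remaining):
            if remaining[ch] > 0:
                # Insert one copy of ch here, then backtrack.
                remaining[ch] -= 1
                yield from extend(prefix + ch, pos)
                remaining[ch] += 1

    yield from extend("", 0)


def best_filled_lcs(a, b, extra):
    """Exhaustive LFCS value: exponential time, for very small instances only."""
    return max(lcs_length(a, filled) for filled in fillings(b, extra))
```

A brute-force reference of this kind is mainly useful for checking heuristic or solver output on small test cases.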
Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size
The development of a satisfying and rigorous mathematical understanding of
the performance of neural networks is a major challenge in artificial
intelligence. Against this background, we study the expressive power of neural
networks through the example of the classical NP-hard Knapsack Problem. Our
main contribution is a class of recurrent neural networks (RNNs) with rectified
linear units that are iteratively applied to each item of a Knapsack instance
and thereby compute optimal or provably good solution values. We show that an
RNN of depth four and width depending quadratically on the profit of an optimum
Knapsack solution is sufficient to find optimum Knapsack solutions. We also
prove the following tradeoff between the size of an RNN and the quality of the
computed Knapsack solution: for Knapsack instances with a given number of
items, an RNN of depth five and a given width computes a solution whose value
is at least a guaranteed fraction of the optimum solution value, where the
fraction depends on the number of items and the width. Our results
build upon a classical dynamic programming formulation of the Knapsack Problem
as well as a careful rounding of profit values that are also at the core of the
well-known fully polynomial-time approximation scheme for the Knapsack Problem.
A carefully conducted computational study qualitatively supports our
theoretical size bounds. Finally, we point out that our results can be
generalized to many other combinatorial optimization problems that admit
dynamic programming solution methods, such as various Shortest Path Problems,
the Longest Common Subsequence Problem, and the Traveling Salesperson Problem.
Comment: A short version of this paper appears in the proceedings of AAAI 202
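The abstract refers to the classical dynamic program over profit values and to the profit rounding used in the fully polynomial-time approximation scheme. The Python sketch below shows that textbook combination, not the paper's neural network construction. The function name is hypothetical, and the rounding step `eps * p_max / n` is the usual textbook choice, assumed here rather than taken from the paper.

```python
def knapsack_rounded(profits, weights, capacity, eps):
    """Approximate 0/1 Knapsack via profit rounding plus a DP over rounded profits.

    Returns (total profit, list of chosen item indices). With the textbook
    analysis the profit is at least (1 - eps) times the optimum.
    """
    # Items that do not fit on their own can never be part of a solution.
    usable = [i for i in range(len(profits)) if weights[i] <= capacity]
    if not usable:
        return 0, []
    p_max = max(profits[i] for i in usable)
    if p_max <= 0:
        return 0, []

    # Round profits down to multiples of `scale` (assumed textbook choice).
    scale = eps * p_max / len(usable)
    scaled = {i: int(profits[i] / scale) for i in usable}

    # best[q] = (minimum weight, chosen items) reaching rounded profit exactly q.
    best = {0: (0, ())}
    for i in usable:
        # Iterate over a snapshot so that item i is used at most once.
        for q, (w, chosen) in list(best.items()):
            nq, nw = q + scaled[i], w + weights[i]
            if nw <= capacity and (nq not in best or nw < best[nq][0]):
                best[nq] = (nw, chosen + (i,))

    chosen = best[max(best)][1]
    return sum(profits[i] for i in chosen), list(chosen)
```

A smaller `eps` gives a finer rounding and a larger table, which mirrors the size-versus-quality tradeoff described in the abstract.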
Repetition-free longest common subsequence of random sequences
A repetition-free Longest Common Subsequence (LCS) of two sequences x and y
is an LCS of x and y where each symbol may appear at most once. Let R denote
the length of a repetition-free LCS of two sequences of n symbols each one
chosen randomly, uniformly, and independently over a k-ary alphabet. We study
the asymptotic, in n and k, behavior of R and establish that there are three
distinct regimes, depending on the relative speed of growth of n and k. For
each regime we establish the limiting behavior of R. In fact, we do more, since
we actually establish tail bounds for large deviations of R from its limiting
behavior.
Our study is motivated by the so-called exemplar model proposed by Sankoff
(1999) and the related similarity measure introduced by Adi et al. (2007). A
natural question that arises in this context, which as we show is related to
long-standing open problems in the area of probabilistic combinatorics, is to
understand the asymptotic, in n and k, behavior of parameter R.
Comment: 15 pages, 1 figure
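For intuition about the quantity R, the following Python sketch computes the repetition-free LCS length exhaustively and averages it over random sequence pairs. It is an illustrative experiment, not part of the paper's analysis: the function names, the seed handling, and the use of integers 0..k-1 as the alphabet are assumptions, and the search is exponential in the alphabet size, so it only suits small n and k.

```python
import random
from functools import lru_cache


def repetition_free_lcs_length(x, y):
    """Length of a longest common subsequence of x and y in which
    every symbol appears at most once (exhaustive, memoized search)."""

    @lru_cache(maxsize=None)
    def best(i, j, used):
        if i == len(x) or j == len(y):
            return 0
        # Skip a position in x or in y.
        result = max(best(i + 1, j, used), best(i, j + 1, used))
        # Match the current symbol only if it has not been used yet.
        if x[i] == y[j] and x[i] not in used:
            result = max(result, 1 + best(i + 1, j + 1, used | {x[i]}))
        return result

    return best(0, 0, frozenset())


def estimate_R(n, k, trials, seed=0):
    """Average repetition-free LCS length over random pairs of length-n
    sequences drawn uniformly and independently from {0, ..., k-1}."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = tuple(rng.randrange(k) for _ in range(n))
        y = tuple(rng.randrange(k) for _ in range(n))
        total += repetition_free_lcs_length(x, y)
    return total / trials
```

Since no symbol repeats, the value can never exceed min(n, k), which is a quick sanity bound for any estimate.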