2,124 research outputs found

Variants of Constrained Longest Common Subsequence

Author: Adi
Alon
Apostolico
Arslan
Bergroth
Bonizzoni
Chin
Cormen
Downey
Fernandes
Gianluca Della Vedova
Gotthilf
Jiang
Maier
Paola Bonizzoni
Pietrzak
Riccardo Dondi
Räihä
Sankoff
Schmidt
Tsai
Yuri Pirola
Publication venue: 'Elsevier BV'
Publication date: 02/12/2009

In this work, we consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings, and a function C_o from A to N, the DC-LCS problem consists of finding the longest subsequence s of s1 and s2 such that s is a supersequence of all the strings in C_s and the number of occurrences in s of each symbol a in A is upper bounded by C_o(a). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we illustrate a fixed-parameter algorithm where the parameter is the length of the solution. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameters are the number of constraint strings and the size of the alphabet A. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant.
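The problem definition in this abstract can be made concrete with a small feasibility check and an exhaustive search. The following is only a minimal sketch for tiny inputs, not the fixed-parameter algorithm the paper describes; the function names are hypothetical, and treating a symbol missing from the occurrence-bound mapping as having bound 0 is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations


def is_subsequence(small, big):
    """Return True if `small` can be obtained from `big` by deleting symbols."""
    it = iter(big)
    return all(ch in it for ch in small)


def is_feasible(s, s1, s2, constraints, max_occ):
    """Check the two DC-LCS constraints for a candidate string s."""
    # s must be a common subsequence of both input strings.
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    # s must be a supersequence of every constraint string.
    if not all(is_subsequence(c, s) for c in constraints):
        return False
    # Each symbol may occur at most max_occ[symbol] times
    # (assumption: a symbol absent from max_occ has bound 0).
    counts = Counter(s)
    return all(counts[a] <= max_occ.get(a, 0) for a in counts)


def dc_lcs_bruteforce(s1, s2, constraints, max_occ):
    """Exhaustive search, exponential in len(s1); for illustration only."""
    for length in range(len(s1), -1, -1):
        for idx in combinations(range(len(s1)), length):
            candidate = "".join(s1[i] for i in idx)
            if is_feasible(candidate, s1, s2, constraints, max_occ):
                return candidate
    return None  # no feasible solution exists


print(dc_lcs_bruteforce("abab", "abab", ["ba"], {"a": 1, "b": 1}))
```

With both occurrence bounds set to 1 the instance behaves like a Repetition-Free LCS with one extra constraint string, which is the sense in which DC-LCS generalizes both variants named in the abstract.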

arXiv.org e-Print Archive

Crossref

An Efficient Dynamic Programming Algorithm for the Generalized LCS Problem with Multiple Substring Exclusion Constrains

Author: Wang Lei
Wang Xiaodong
Wu Yingjie
Zhu Daxin
Publication venue
Publication date: 07/03/2013

In this paper, we consider a generalized longest common subsequence problem with multiple substring exclusion constraints. For the two input sequences $X$ and $Y$ of lengths $n$ and $m$, and a set of $d$ constraints $P=\{P_1,\ldots,P_d\}$ of total length $r$, the problem is to find a common subsequence $Z$ of $X$ and $Y$ that excludes each constraint string in $P$ as a substring and whose length is maximized. The problem was declared to be NP-hard \cite{1}, but we finally found that this is not true. A new dynamic programming solution for this problem is presented in this paper. The correctness of the new algorithm is proved. The time complexity of our algorithm is $O(nmr)$.

Comment: arXiv admin note: substantial text overlap with arXiv:1301.718
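One way to see why a bound of the form O(nmr) is plausible is to track, alongside the usual LCS table, the state of a multi-pattern matching automaton over the constraint strings. The sketch below uses a standard Aho-Corasick automaton for that purpose; it is an illustration under that assumption and is not claimed to be the algorithm of the paper. All function names are hypothetical, and constraint strings are assumed to be non-empty.

```python
from collections import deque


def build_automaton(patterns, alphabet):
    """Aho-Corasick automaton: full transition table and 'forbidden' flags."""
    children, bad = [{}], [False]
    for p in patterns:
        node = 0
        for ch in p:
            if ch not in children[node]:
                children.append({})
                bad.append(False)
                children[node][ch] = len(children) - 1
            node = children[node][ch]
        bad[node] = True  # a whole constraint string ends here
    fail = [0] * len(children)
    go = [dict() for _ in children]
    for ch in alphabet:
        go[0][ch] = children[0].get(ch, 0)
    queue = deque(children[0].values())
    while queue:
        u = queue.popleft()
        # A state is forbidden if any suffix of its string is a constraint.
        bad[u] = bad[u] or bad[fail[u]]
        for ch in alphabet:
            if ch in children[u]:
                v = children[u][ch]
                fail[v] = go[fail[u]][ch]
                go[u][ch] = v
                queue.append(v)
            else:
                go[u][ch] = go[fail[u]][ch]
    return go, bad


def exclusion_lcs_length(x, y, patterns):
    """Length of a longest common subsequence with no pattern as a substring."""
    if any(p == "" for p in patterns):
        raise ValueError("constraint strings are assumed to be non-empty")
    alphabet = set(x) | set(y) | set("".join(patterns))
    go, bad = build_automaton(patterns, alphabet)
    n, m = len(x), len(y)
    # dp[i][j] maps an automaton state to the best length reachable
    # using the prefixes x[:i] and y[:j].
    dp = [[{} for _ in range(m + 1)] for _ in range(n + 1)]
    dp[0][0][0] = 0

    def relax(cell, state, length):
        if cell.get(state, -1) < length:
            cell[state] = length

    for i in range(n + 1):
        for j in range(m + 1):
            for state, length in dp[i][j].items():
                if i < n:
                    relax(dp[i + 1][j], state, length)   # skip x[i]
                if j < m:
                    relax(dp[i][j + 1], state, length)   # skip y[j]
                if i < n and j < m and x[i] == y[j]:
                    nxt = go[state][x[i]]
                    if not bad[nxt]:                     # keep only allowed states
                        relax(dp[i + 1][j + 1], nxt, length + 1)
    return max(dp[n][m].values())


print(exclusion_lcs_length("abab", "abab", ["ab"]))
```

The automaton has at most r + 1 states, so the table has O(nmr) entries, each processed with a constant number of transitions once the transition table is built.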

arXiv.org e-Print Archive

CiteSeerX

Heuristic algorithms for the Longest Filled Common Subsequence Problem

Author: Mincu Radu Stefan
Popa Alexandru
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/04/2019

At CPM 2017, Castelli et al. defined and studied a new variant of the Longest Common Subsequence Problem, termed the Longest Filled Common Subsequence Problem (LFCS). For the LFCS problem, the input consists of two strings $A$ and $B$ and a multiset of characters $\mathcal{M}$. The goal is to insert the characters from $\mathcal{M}$ into the string $B$, thus obtaining a new string $B^*$, such that the Longest Common Subsequence (LCS) between $A$ and $B^*$ is maximized. Castelli et al. show that the problem is NP-hard and provide a 3/5-approximation algorithm for the problem. In this paper we study the problem from the experimental point of view. We introduce, implement and test new heuristic algorithms and compare them with the approximation algorithm of Castelli et al. Moreover, we introduce an Integer Linear Program (ILP) model for the problem and we use the state-of-the-art ILP solver, Gurobi, to obtain exact solutions for moderate-sized instances.

Comment: Accepted and presented as a proceedings paper at SYNASC 201
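The LFCS objective can be illustrated with an exhaustive search over all ways of inserting the multiset into the second string. This is a minimal sketch for very small instances only; it is neither the approximation algorithm, the heuristics, nor the ILP model mentioned in the abstract, and the function names are hypothetical. It inserts every character of the multiset, which loses nothing because adding characters to a string can never shorten an LCS.

```python
def lcs_length(a, b):
    """Classical LCS length via dynamic programming with a rolling row."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lfcs_bruteforce(a, b, fill):
    """Try every filled string B* and return the best LCS with `a`.

    Exponential in len(fill); intended only to make the definition concrete.
    """
    candidates = {b}
    for ch in fill:
        # Insert the next multiset character at every possible position.
        candidates = {
            s[:pos] + ch + s[pos:]
            for s in candidates
            for pos in range(len(s) + 1)
        }
    return max(lcs_length(a, s) for s in candidates)


print(lfcs_bruteforce("abcd", "ad", ["b", "c"]))
```

A simple sanity bound follows from the definition: the result is never larger than the length of the first string, nor larger than the plain LCS plus the size of the multiset.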

arXiv.org e-Print Archive

Crossref

Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

Author: Hertrich Christoph
Skutella Martin
Publication venue
Publication date: 04/01/2021

The development of a satisfying and rigorous mathematical understanding of the performance of neural networks is a major challenge in artificial intelligence. Against this background, we study the expressive power of neural networks through the example of the classical NP-hard Knapsack Problem. Our main contribution is a class of recurrent neural networks (RNNs) with rectified linear units that are iteratively applied to each item of a Knapsack instance and thereby compute optimal or provably good solution values. We show that an RNN of depth four and width depending quadratically on the profit of an optimum Knapsack solution is sufficient to find optimum Knapsack solutions. We also prove the following tradeoff between the size of an RNN and the quality of the computed Knapsack solution: for Knapsack instances consisting of $n$ items, an RNN of depth five and width $w$ computes a solution of value at least $1-\mathcal{O}(n^2/\sqrt{w})$ times the optimum solution value. Our results build upon a classical dynamic programming formulation of the Knapsack Problem as well as a careful rounding of profit values that are also at the core of the well-known fully polynomial-time approximation scheme for the Knapsack Problem. A carefully conducted computational study qualitatively supports our theoretical size bounds. Finally, we point out that our results can be generalized to many other combinatorial optimization problems that admit dynamic programming solution methods, such as various Shortest Path Problems, the Longest Common Subsequence Problem, and the Traveling Salesperson Problem.

Comment: A short version of this paper appears in the proceedings of AAAI 202
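The two classical ingredients the abstract builds on, a dynamic program indexed by profit and a rounding of profit values, can be sketched in plain Python. This is the textbook formulation under the assumption of non-negative integer profits and non-negative weights; it does not reproduce the neural network construction, and the function names are hypothetical.

```python
def knapsack_exact(profits, weights, capacity):
    """Profit-indexed DP: min_w[q] is the least weight achieving profit q."""
    total = sum(profits)
    inf = float("inf")
    min_w = [0] + [inf] * total
    for p, w in zip(profits, weights):
        # Go downwards so that each item is used at most once.
        for q in range(total, p - 1, -1):
            if min_w[q - p] + w < min_w[q]:
                min_w[q] = min_w[q - p] + w
    return max(q for q in range(total + 1) if min_w[q] <= capacity)


def knapsack_fptas(profits, weights, capacity, eps):
    """Round profits down to multiples of eps * p_max / n, then run the DP.

    Returns the true profit of a feasible item set; the standard analysis
    gives a value of at least (1 - eps) times the optimum.
    """
    items = [(p, w) for p, w in zip(profits, weights) if w <= capacity and p > 0]
    if not items:
        return 0
    scale = eps * max(p for p, _ in items) / len(items)
    scaled = [int(p // scale) for p, _ in items]
    total = sum(scaled)
    # best[q] = (min weight, true profit) of a set with scaled profit q.
    best = [(0, 0)] + [None] * total
    for (p, w), s in zip(items, scaled):
        for q in range(total, s - 1, -1):
            prev = best[q - s]
            if prev is not None and (best[q] is None or prev[0] + w < best[q][0]):
                best[q] = (prev[0] + w, prev[1] + p)
    return max(e[1] for e in best if e is not None and e[0] <= capacity)


profits, weights, capacity = [60, 100, 120], [10, 20, 30], 50
print(knapsack_exact(profits, weights, capacity))
print(knapsack_fptas(profits, weights, capacity, eps=0.1))
```

The exact table has one entry per achievable profit value, which matches the abstract's remark that the required network width depends on the profit of an optimum solution; rounding shrinks that table at the cost of a bounded loss in solution value.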

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Repetition-free longest common subsequence of random sequences

Author: Fernandes Cristina G.
Kiwi Marcos
Publication venue
Publication date: 21/05/2013

A repetition-free Longest Common Subsequence (LCS) of two sequences x and y is an LCS of x and y where each symbol may appear at most once. Let R denote the length of a repetition-free LCS of two sequences of n symbols, each one chosen randomly, uniformly, and independently over a k-ary alphabet. We study the asymptotic behavior of R in n and k, and establish that there are three distinct regimes, depending on the relative speed of growth of n and k. For each regime we establish the limiting behavior of R. In fact, we do more, since we actually establish tail bounds for large deviations of R from its limiting behavior. Our study is motivated by the so-called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2007). A natural question that arises in this context, which as we show is related to long-standing open problems in the area of probabilistic combinatorics, is to understand the asymptotic behavior of the parameter R in n and k.

Comment: 15 pages, 1 figur
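For small n and k, the quantity R can be computed exactly and averaged over random inputs. The sketch below is only an empirical illustration under stated assumptions: it uses a dynamic program over sets of used symbols, which takes time exponential in k, and a plain Monte Carlo average. It says nothing about the asymptotic regimes analyzed in the paper, and the function names and parameters are hypothetical.

```python
import random


def repetition_free_lcs_length(x, y):
    """Exact length of a longest common subsequence with no repeated symbol."""
    index = {s: i for i, s in enumerate(set(x) & set(y))}
    n, m = len(x), len(y)
    # reach[i][j] holds bitmasks of symbol sets realizable by a repetition-free
    # common subsequence of x[:i] and y[:j]; its length is the popcount.
    reach = [[set() for _ in range(m + 1)] for _ in range(n + 1)]
    reach[0][0].add(0)
    for i in range(n + 1):
        for j in range(m + 1):
            masks = reach[i][j]
            if i < n:
                reach[i + 1][j] |= masks          # skip x[i]
            if j < m:
                reach[i][j + 1] |= masks          # skip y[j]
            if i < n and j < m and x[i] == y[j]:
                bit = 1 << index[x[i]]
                # Use the matching symbol only if it has not been used yet.
                reach[i + 1][j + 1].update(
                    mk | bit for mk in masks if not mk & bit
                )
    return max(bin(mk).count("1") for mk in reach[n][m])


def estimate_mean_r(n, k, trials, seed=0):
    """Monte Carlo estimate of E[R] for uniform sequences over k symbols."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = [rng.randrange(k) for _ in range(n)]
        y = [rng.randrange(k) for _ in range(n)]
        total += repetition_free_lcs_length(x, y)
    return total / trials


print(estimate_mean_r(n=12, k=4, trials=50))
```

Since each symbol is used at most once, R can never exceed min(n, k), which gives a quick check on any estimate produced this way.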

arXiv.org e-Print Archive

CiteSeerX