Variants of Constrained Longest Common Subsequence

Author: Adi
Alon
Apostolico
Arslan
Bergroth
Bonizzoni
Chin
Cormen
Downey
Fernandes
Gianluca Della Vedova
Gotthilf
Jiang
Maier
Paola Bonizzoni
Pietrzak
Riccardo Dondi
Räihä
Sankoff
Schmidt
Tsai
Yuri Pirola
Publication venue: 'Elsevier BV'
Publication date: 02/12/2009

In this work, we consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings, and a function Co from A to N, the DC-LCS problem consists of finding the longest subsequence s of s1 and s2 such that s is a supersequence of all the strings in C_s and such that the number of occurrences in s of each symbol a in A is upper bounded by Co(a). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we illustrate a fixed-parameter algorithm where the parameter is the length of the solution. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameter is the number of the constraint strings and the size of the alphabet A. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant.
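The DC-LCS definition above can be made concrete with a small feasibility check. The following is a minimal sketch, not the paper's fixed-parameter algorithm: it only tests whether a candidate string s satisfies the three stated conditions. The function names and the dictionary `max_occ` (standing in for Co) are hypothetical, and symbols missing from `max_occ` are assumed to have bound 0.

```python
from collections import Counter


def is_subsequence(t, s):
    """Return True if t can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` advances the iterator, so matches are found left to right.
    return all(ch in it for ch in t)


def is_feasible_dc_lcs(s, s1, s2, constraints, max_occ):
    """Check the DC-LCS conditions for a candidate s (feasibility only).

    constraints plays the role of C_s and max_occ the role of Co.
    Assumption: a symbol absent from max_occ has an occurrence bound of 0.
    """
    counts = Counter(s)
    return (
        is_subsequence(s, s1)
        and is_subsequence(s, s2)
        and all(is_subsequence(c, s) for c in constraints)
        and all(counts[a] <= max_occ.get(a, 0) for a in counts)
    )
```

A feasible candidate is not necessarily a longest one; finding the longest feasible s is the hard part that the abstract addresses.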

arXiv.org e-Print Archive

Crossref

Combined super-/substring and super-/subsequence problems

Author: Manlove D.F.
Middendorf M.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

Super-/substring problems and super-/subsequence problems are well-known problems in stringology that have applications in a variety of areas, such as manufacturing systems design and molecular biology. Here we investigate the complexity of a new type of such problem that forms a combination of a super-/substring and a super-/subsequence problem. Moreover, we introduce different types of minimal superstring and maximal substring problems. In particular, we consider the following problems: given a set L of strings and a string S, (i) find a minimal superstring (or maximal substring) of L that is also a supersequence (or a subsequence) of S, (ii) find a minimal supersequence (or maximal subsequence) of L that is also a superstring (or a substring) of S. In addition, some non-super-/non-substring and non-super-/non-subsequence variants are studied. We obtain several NP-hardness or even MAX SNP-hardness results and also identify types of "weak minimal" superstrings and "weak maximal" substrings for which (i) is polynomial-time solvable.
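The difference between the superstring condition (contiguous occurrence) and the supersequence condition (order-preserving, gaps allowed) in problem (i) can be illustrated with a short check. This is only a sketch of the two membership tests under hypothetical function names; it does not test minimality and is not taken from the paper.

```python
def is_subsequence(t, s):
    """Return True if t occurs in s in order, possibly with gaps."""
    it = iter(s)
    return all(ch in it for ch in t)


def is_common_superstring(candidate, strings):
    """Return True if every string occurs contiguously in candidate."""
    return all(x in candidate for x in strings)


def satisfies_problem_i(candidate, L, S):
    """Superstring of every string in L and supersequence of S.

    Minimality of the candidate is deliberately not checked here.
    """
    return is_common_superstring(candidate, L) and is_subsequence(S, candidate)
```

For example, with L = ["ab", "bc"] and S = "ac", the candidate "abc" passes both tests, while "bcab" contains both strings of L contiguously but does not contain "ac" as a subsequence.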

CiteSeerX

Elsevier - Publisher Connector

Enlighten

Towards a better solution to the shortest common supersequence problem: the deposition and reduction algorithm

Author: D Gusfield
D Sankoff
DE Foulser
EA Hubbell
G Nicosia
Hon Wai Leong
J Branke
JA Storer
K Ning
Kang Ning
P Barone
R Michels
RW Irving
S Kasif
T Jiang
TH Cormen
TK Sellis
VG Timkovsky
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The problem of finding a Shortest Common Supersequence (SCS) of a set of sequences is an important problem with applications in many areas. It is a key problem in biological sequence analysis. The SCS problem is well known to be NP-complete. Many heuristic algorithms have been proposed. Some heuristics work well on a few long sequences (as in sequence comparison applications); others work well on many short sequences (as in oligo-array synthesis). Unfortunately, most do not work well on large SCS instances where there are many long sequences. RESULTS: In this paper, we present a Deposition and Reduction (DR) algorithm for solving large SCS instances of biological sequences. There are two processes in our DR algorithm: a deposition process and a reduction process. The deposition process is responsible for generating a small set of common supersequences, and the reduction process shortens these common supersequences by removing some characters while preserving the common supersequence property. Our evaluation on simulated data and real DNA and protein sequences shows that our algorithm consistently produces the best results compared to many well-known heuristic algorithms, especially on large instances. CONCLUSION: Our DR algorithm provides a partial answer to the open problem of designing an efficient heuristic algorithm for the SCS problem on many long sequences. Our algorithm has a bounded approximation ratio. The algorithm is efficient in both running time and space complexity, and our evaluation shows that it is practical even for SCS problems on many long sequences.
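The idea behind the reduction process, removing characters while preserving the common supersequence property, can be sketched as a simple greedy pass. This is an illustrative assumption about how such a step might look, not the authors' DR algorithm; the function names are hypothetical and the left-to-right deletion order is an arbitrary choice.

```python
def is_subsequence(t, s):
    """Return True if t can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in t)


def is_common_supersequence(candidate, sequences):
    """Return True if every input sequence is a subsequence of candidate."""
    return all(is_subsequence(seq, candidate) for seq in sequences)


def reduce_supersequence(candidate, sequences):
    """Greedily delete characters that are not needed.

    Scans left to right; a deletion is kept only if the shortened string
    is still a common supersequence. The result cannot be shortened by
    any single deletion, but it is not guaranteed to be a shortest one.
    """
    if not is_common_supersequence(candidate, sequences):
        raise ValueError("candidate is not a common supersequence")
    i = 0
    while i < len(candidate):
        shorter = candidate[:i] + candidate[i + 1:]
        if is_common_supersequence(shorter, sequences):
            candidate = shorter  # deletion accepted; retry the same index
        else:
            i += 1  # this character is needed; move on
    return candidate
```

Starting from the concatenation of the inputs, which is always a common supersequence, gives a trivial initial candidate: for the sequences "ab" and "ba", the concatenation "abba" reduces to "aba".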

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences

The Loading Time Scheduling Problem

Author: Bhatia Randeep
Khuller Samir
Naor Joseph (Seffi)
Publication date: 15/10/1998

In this paper we study precedence constrained scheduling problems, where the tasks can only be executed on a specified subset of the machines. Each machine has a loading time that is incurred only for the first task that is scheduled on the machine in a particular run. This basic scheduling problem arises in the context of machining on numerically controlled machines, query optimization in databases, and in other artificial intelligence applications. We give the first non-trivial approximation algorithm for this problem. We also prove non-trivial lower bounds on best possible approximation ratios for these problems. These improve on the non-approximability results that are implied by the non-approximability results for the shortest common supersequence problem. We use the same algorithmic technique to obtain approximation algorithms for a problem arising in the context of code generation for parallel machines, and for the weighted shortest common supersequence problem.

Digital Repository at the University of Maryland

Planet Packing Revisited

Author: Richards Dana
Publication venue: Digital Commons @ Butler University
Publication date: 01/08/2001

Ross Eckler discusses a problem in his article "Planet Packing" in the May 2001 Word Ways: given a list of words, such as the names of the planets, how efficiently can they be packed into a single string of characters so that each word on the list can be read off left to right (but not necessarily contiguously)? He hypothesizes there is no guarantee that any algorithm will end up with a minimum string. Since the design and analysis of algorithms has been my area of research for some 25 years, this caught my attention. Informally, an algorithm is a terminating procedure that could be coded as a computer program. (However, the procedure in the "Planet Packing" article does not contain enough tie-breaking rules to qualify as an algorithm.)

Digital Commons @ Butler University

Planet Packing Revisited

Author: Ashikawa N.
Bozhenkov S.
Kantor M.
Litnovsky A.
Philipps V.
Pospieszczyk A.
Ratynskaia S.
Rudakov D.
Tsalas M.
Publication venue: Digital Commons @ Butler University
Publication date: 01/08/2001

Crossref

DIFFER: Publications

Juelich Shared Electronic Resources

Digital Commons @ Butler University

MPG.PuRe