2,353 research outputs found

Bounds on the Number of Longest Common Subsequences

Author: Greenberg Ronald I.
Publication date: 01/08/2003

This paper performs the analysis necessary to bound the running time of known, efficient algorithms for generating all longest common subsequences. That is, we bound the running time as a function of input size for algorithms with time essentially proportional to the output size. This paper considers both the case of computing all distinct LCSs and the case of computing all LCS embeddings. Also included is an analysis of how much better the efficient algorithms are than the standard method of generating LCS embeddings. A full analysis is carried out with running times measured as a function of the total number of input characters, and much of the analysis is also provided for cases in which the two input sequences are of the same specified length or of two independently specified lengths. Comment: 13 pages. Corrected typos, corrected operation of hyperlinks, improved presentation

arXiv.org e-Print Archive

Loyola eCommons

Computing the Number of Longest Common Subsequences

Author: Greenberg Ronald I.
Publication date: 01/01/2003

This note provides very simple, efficient algorithms for computing the number of distinct longest common subsequences of two input strings and for computing the number of LCS embeddings. Comment: 3 pages, LaTeX
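To make the notion of counting LCS embeddings concrete, here is a minimal Python sketch of a textbook-style dynamic program. It is an illustration under stated assumptions and not necessarily the algorithm given in the note: the function name `count_lcs_embeddings` is hypothetical, and an "embedding" is taken to mean a pair of index sequences, one per string, that spell out a longest common subsequence.

```python
def count_lcs_embeddings(a: str, b: str) -> tuple[int, int]:
    """Return (LCS length, number of LCS embeddings) for strings a and b.

    Hypothetical helper for illustration; uses O(len(a) * len(b)) time and space.
    """
    n, m = len(a), len(b)
    # length[i][j]: LCS length of a[:i] and b[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    # count[i][j]: number of embeddings of that length (the empty embedding counts once)
    count = [[1] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                best = length[i - 1][j - 1] + 1
                # Embeddings that match a[i-1] with b[j-1].
                ways = count[i - 1][j - 1]
                # Embeddings that leave a[i-1] unused, or leave b[j-1] unused.
                # These sets cannot overlap at length `best` in the match case.
                if length[i - 1][j] == best:
                    ways += count[i - 1][j]
                if length[i][j - 1] == best:
                    ways += count[i][j - 1]
            else:
                best = max(length[i - 1][j], length[i][j - 1])
                ways = 0
                if length[i - 1][j] == best:
                    ways += count[i - 1][j]
                if length[i][j - 1] == best:
                    ways += count[i][j - 1]
                # Embeddings using neither last character were counted twice.
                if length[i - 1][j - 1] == best:
                    ways -= count[i - 1][j - 1]
            length[i][j] = best
            count[i][j] = ways

    return length[n][m], count[n][m]


print(count_lcs_embeddings("AB", "BA"))  # (1, 2): either "A" or "B" can be matched
```

Counting distinct LCSs, as opposed to embeddings, needs extra care to avoid counting the same subsequence once per embedding, so the sketch above does not cover that case.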

arXiv.org e-Print Archive

Loyola eCommons

Faster Approximate String Matching for Short Patterns

Authors: A. Andersson, A.H. Wright, D. Gusfield, D. Harel, D.E. Knuth, E. Ukkonen, E.W. Myers, F.T. Leighton, G. Myers, G. Navarro, G.M. Landau, H. Hyyrö, K.E. Batcher, M. Farach-Colton, M.A. Bender, P. Bille, P. Sellers, Philip Bille, R. Baeza-Yates, R. Cole, R.A. Baeza-Yates, R.A. Wagner, S. Albers, S. Alstrup, S. Wu, S.C. Sahinalp, T. Hagerup, T.H. Cormen, V.L. Arlazarov, W. Masek, Z. Galil
Publication date: 17/03/2011

We study the classical approximate string matching problem, that is, given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have lengths $m$ and $n$, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$ we present an algorithm using time $O(nk \cdot \min(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}) + n)$. When $P$ is short, namely, $m = 2^{o(\sqrt{\log n})}$ or $m = 2^{o(\sqrt{w/\log w})}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism. Comment: To appear in Theory of Computing Systems
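As a reference point for the problem statement in this abstract, the following Python sketch reports the ending positions using the classical column-by-column dynamic program, which runs in O(mn) time. It is a baseline for illustration only and is not the tabulated Landau-Vishkin implementation the abstract describes; the function name `approx_match_end_positions` and the choice of 0-based ending indices are assumptions made here.

```python
def approx_match_end_positions(pattern: str, text: str, k: int) -> list[int]:
    """Return 0-based indices j such that some substring of `text` ending at j
    has edit distance at most k to `pattern`.

    Baseline O(len(pattern) * len(text)) dynamic program (hypothetical helper).
    """
    m = len(pattern)
    # col[i]: minimum edit distance between pattern[:i] and any substring of
    # the text ending at the current position. Row 0 stays 0 because a match
    # may start anywhere in the text.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        new = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new[i] = min(
                col[i - 1] + cost,  # match or substitution
                col[i] + 1,         # extra text character
                new[i - 1] + 1,     # missing pattern character
            )
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends


print(approx_match_end_positions("abc", "xxabcxx", 1))  # [3, 4, 5]
```

With `k = 0` the same call reduces to exact matching and reports only index 4, the end of the single occurrence of `abc`.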

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology