1,478 research outputs found
PalFM-Index: FM-Index for Palindrome Pattern Matching
n} in the preprocessing phase
Finding approximate palindromes in strings
We introduce a novel definition of approximate palindromes in strings, and
provide an algorithm to find all maximal approximate palindromes in a string
with up to errors. Our definition is based on the usual edit operations of
approximate pattern matching, and the algorithm we give, for a string of size
on a fixed alphabet, runs in time. We also discuss two
implementation-related improvements to the algorithm, and demonstrate their
efficacy in practice by means of both experiments and an average-case analysis
Palindrome Recognition In The Streaming Model
In the Palindrome Problem one tries to find all palindromes (palindromic
substrings) in a given string. A palindrome is defined as a string which reads
forwards the same as backwards, e.g., the string "racecar". A related problem
is the Longest Palindromic Substring Problem in which finding an arbitrary one
of the longest palindromes in the given string suffices. We regard the
streaming version of both problems. In the streaming model the input arrives
over time and at every point in time we are only allowed to use sublinear
space. The main algorithms in this paper are the following: The first one is a
one-pass randomized algorithm that solves the Palindrome Problem. It has an
additive error and uses ) space. The second algorithm is a two-pass
algorithm which determines the exact locations of all longest palindromes. It
uses the first algorithm as the first pass. The third algorithm is again a
one-pass randomized algorithm, which solves the Longest Palindromic Substring
Problem. It has a multiplicative error using only space. We also
give two variants of the first algorithm which solve other related practical
problems
Palindromic Decompositions with Gaps and Errors
Identifying palindromes in sequences has been an interesting line of research
in combinatorics on words and also in computational biology, after the
discovery of the relation of palindromes in the DNA sequence with the HIV
virus. Efficient algorithms for the factorization of sequences into palindromes
and maximal palindromes have been devised in recent years. We extend these
studies by allowing gaps in decompositions and errors in palindromes, and also
imposing a lower bound to the length of acceptable palindromes.
We first present an algorithm for obtaining a palindromic decomposition of a
string of length n with the minimal total gap length in time O(n log n * g) and
space O(n g), where g is the number of allowed gaps in the decomposition. We
then consider a decomposition of the string in maximal \delta-palindromes (i.e.
palindromes with \delta errors under the edit or Hamming distance) and g
allowed gaps. We present an algorithm to obtain such a decomposition with the
minimal total gap length in time O(n (g + \delta)) and space O(n g).Comment: accepted to CSR 201
- …