11 research outputs found

    Palindromic Decompositions with Gaps and Errors

    Full text link
    Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing gaps in decompositions and errors in palindromes, and also imposing a lower bound to the length of acceptable palindromes. We first present an algorithm for obtaining a palindromic decomposition of a string of length n with the minimal total gap length in time O(n log n * g) and space O(n g), where g is the number of allowed gaps in the decomposition. We then consider a decomposition of the string in maximal \delta-palindromes (i.e. palindromes with \delta errors under the edit or Hamming distance) and g allowed gaps. We present an algorithm to obtain such a decomposition with the minimal total gap length in time O(n (g + \delta)) and space O(n g).Comment: accepted to CSR 201

    Finding approximate palindromes in strings

    Full text link
    We introduce a novel definition of approximate palindromes in strings, and provide an algorithm to find all maximal approximate palindromes in a string with up to kk errors. Our definition is based on the usual edit operations of approximate pattern matching, and the algorithm we give, for a string of size nn on a fixed alphabet, runs in O(k2n)O(k^2 n) time. We also discuss two implementation-related improvements to the algorithm, and demonstrate their efficacy in practice by means of both experiments and an average-case analysis

    Efficient String Matching on Coded Texts

    Get PDF
    The so called "four Russians technique'' is often used to speed up algorithms by encoding several data items in a single memory cell. Given a sequence of n symbols over a constant size alphabet, one can encode the sequence into O(n / lambda) memory cells in O(log(lambda) ) time using n / log(lambda) processors. This paper presents an efficient CRCW-PRAM string-matching algorithm for coded texts that takes O(log log(m/lambda)) time making only O(n / lambda ) operations, an improvement by a factor of lambda = O(log n) on the number of operations used in previous algorithms. Using this string-matching algorithm one can test if a string is square-free and find all palindromes in a string in O(log log n) time using n / log log n processors

    Finding All Periods and Initial Palindromes of a String in Parallel

    No full text
    An optimal O(log log n) time CRCW-PRAM algorithm for computing all periods of a string is presented. Previous parallel algorithms compute the period only if it is shorter than half of the length of the string. This algorithm can be used to find all initial palindromes of a string in the same time and processor bounds. Both algorithms are the fastest possible over a general alphabet. We derive a lower bound for finding palindromes by a modification of a previously known lower bound for finding the period of a string [3]. When p processors are available the bounds become \Theta(d n p e + log log d1+p=ne 2p)

    Finding All Periods and Initial Palindromes of a String in Parallel

    Get PDF
    An optimal O(log log n) time CRCW-PRAM algorithm for computing all period lengths of a string is presented. Previous parallel algorithms compute the period only if it is shorter than half of the length of the string. The algorithm can be used to find all initial palindromes of a string in the same time and processor bounds. Both algorithms are the fastest possible over a general alphabet. We derive a lower bound for finding initial palindromes by modifying a known lower bound for finding the period length of a string [9]. When p processors are available the bounds become \Theta(d n p e+log log d1+p=ne 2p). 1 Introduction A string S[0::n] has a period S[0::p \Gamma 1] of length p if S[i] = S[i + p] for i = 0 \Delta \Delta \Delta n \Gamma p. The period of S[0::n] is defined as its shortest period. Periodicity properties of strings have been studied extensively [18] and are practically used almost in all efficient sequential and parallel string matching algorithms. A palindrome is a ..
    corecore