112 research outputs found

    Palindromic Decompositions with Gaps and Errors

    Full text link
    Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing gaps in decompositions and errors in palindromes, and also imposing a lower bound to the length of acceptable palindromes. We first present an algorithm for obtaining a palindromic decomposition of a string of length n with the minimal total gap length in time O(n log n * g) and space O(n g), where g is the number of allowed gaps in the decomposition. We then consider a decomposition of the string in maximal \delta-palindromes (i.e. palindromes with \delta errors under the edit or Hamming distance) and g allowed gaps. We present an algorithm to obtain such a decomposition with the minimal total gap length in time O(n (g + \delta)) and space O(n g).Comment: accepted to CSR 201

    Finding Maximal 2-Dimensional Palindromes

    Get PDF
    This paper extends the problem of palindrome searching into a higher dimension, addressing two definitions of 2D palindromes. The first definition implies a square, while the second definition (also known as a centrosymmetric factor), can be any rectangular shape. We describe two algorithms for searching a 2D text for maximal palindromes, one for each type of 2D palindrome. The first algorithm is optimal; it runs in linear time, on par with Manacher\u27s linear time 1D palindrome algorithm. The second algorithm searches a text of size n_1 x n_2 (n_1 >= n_2) in O(n_2) time for each of its n_1 x n_2 positions. Since each position may have up to O(n_2) maximal palindromes centered at that location, the second result is also optimal in terms of the worst-case output size

    Average-Case Optimal Approximate Circular String Matching

    Full text link
    Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that are at a distance at most k from x or from any of its rotations. In this article, we present a new algorithm for approximate circular string matching under the edit distance model with optimal average-case search time O(n(k + log m)/m). Optimal average-case search time can also be achieved by the algorithms for multiple approximate string matching (Fredriksson and Navarro, 2004) using x and its rotations as the set of multiple patterns. Here we reduce the preprocessing time and space requirements compared to that approach

    Longest substring palindrome after edit

    Get PDF
    It is known that the length of the longest substring palindromes (LSPals) of a given string T of length n can be computed in O(n) time by Manacher\u27s algorithm [J. ACM \u2775]. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(log (min {sigma, log n })) time after single character substitution, insertion, or deletion, where sigma denotes the number of distinct characters appearing in T. We also propose an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(l + log n) time, after an existing substring in T is replaced by a string of arbitrary length l

    Faster Queries for Longest Substring Palindrome After Block Edit

    Get PDF
    Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(l + log log n) time, after a substring in T is replaced by a string of arbitrary length l. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(l + log n) time for each query
    • …
    corecore