183 research outputs found

    Palindrome Recognition In The Streaming Model

    In the Palindrome Problem one tries to find all palindromes (palindromic substrings) in a given string. A palindrome is defined as a string which reads forwards the same as backwards, e.g., the string "racecar". A related problem is the Longest Palindromic Substring Problem, in which finding an arbitrary one of the longest palindromes in the given string suffices. We consider the streaming version of both problems. In the streaming model the input arrives over time, and at every point in time we are only allowed to use sublinear space. The main algorithms in this paper are the following. The first is a one-pass randomized algorithm that solves the Palindrome Problem. It has an additive error and uses $O(\sqrt{n})$ space. The second is a two-pass algorithm which determines the exact locations of all longest palindromes. It uses the first algorithm as the first pass. The third is again a one-pass randomized algorithm, which solves the Longest Palindromic Substring Problem. It has a multiplicative error and uses only $O(\log n)$ space. We also give two variants of the first algorithm which solve other related practical problems.
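A minimal sketch of how a palindrome can be tested in one pass with constant memory, assuming the standard Karp-Rabin fingerprinting technique. It is not the algorithm from the paper above; it only checks whether each prefix of the stream is a palindrome, and it may report a false positive with very small probability. The names `PrefixPalindromeSketch`, `palindromic_prefix_lengths` and `MODULUS` are hypothetical.

```python
import random

MODULUS = (1 << 61) - 1  # a Mersenne prime; an illustrative choice, not from the source


class PrefixPalindromeSketch:
    """Keeps two fingerprints of the stream read so far, in O(1) machine words."""

    def __init__(self, seed=None):
        rng = random.Random(seed)
        self.base = rng.randrange(2, MODULUS - 1)  # random evaluation point
        self.forward = 0   # sum of s[i] * base^i        (fingerprint of the prefix)
        self.backward = 0  # sum of s[i] * base^(n-1-i)  (fingerprint of its reverse)
        self.power = 1     # base^n, where n is the number of symbols read

    def push(self, symbol):
        code = ord(symbol) + 1  # shift by one so no symbol maps to zero
        self.forward = (self.forward + code * self.power) % MODULUS
        self.backward = (self.backward * self.base + code) % MODULUS
        self.power = (self.power * self.base) % MODULUS

    def is_palindrome(self):
        # Equal fingerprints mean "palindrome" with high probability;
        # a true palindrome always passes this test.
        return self.forward == self.backward


def palindromic_prefix_lengths(stream, seed=None):
    """Return the lengths of all prefixes that the sketch reports as palindromes."""
    sketch = PrefixPalindromeSketch(seed)
    lengths = []
    for count, symbol in enumerate(stream, start=1):
        sketch.push(symbol)
        if sketch.is_palindrome():
            lengths.append(count)
    return lengths
```

The sketch stores three integers regardless of the stream length, which illustrates why randomization helps in the streaming model; finding palindromes at arbitrary positions needs the additional machinery described in the abstract.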

    Palindrome Recognition In The Streaming Model

    A palindrome is defined as a string which reads forwards the same as backwards, for example, the string "racecar". In the Palindrome Problem, one tries to find all palindromes in a given string. In contrast, in the Longest Palindromic Substring Problem, the goal is to find an arbitrary one of the longest palindromes in the string. In this paper we present three algorithms in the streaming model for the above problems, where at any point in time we are only allowed to use sublinear space. We first present a one-pass randomized algorithm that solves the Palindrome Problem. It has an additive error and uses $O(\sqrt{n})$ space. We also give two variants of the algorithm which solve related and practical problems. The second algorithm determines the exact locations of all longest palindromes using two passes and $O(\sqrt{n})$ space. The third algorithm is a one-pass randomized algorithm, which solves the Longest Palindromic Substring Problem. It has a multiplicative error and uses only $O(\log n)$ space.

    Streaming for Aibohphobes: Longest Palindrome with Mismatches

    A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given a metric and an integer $d>0$, a $d$-near-palindrome is a string of Hamming distance at most $d$ from its reverse. We study the natural problem of identifying the longest $d$-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present the first streaming algorithm for the longest $d$-near-palindrome problem that returns a $d$-near-palindrome whose length is within a multiplicative $(1+\varepsilon)$-factor of the longest $d$-near-palindrome. Our algorithm also returns the set of mismatched indices in the $d$-near-palindrome, and uses $O\left(\frac{d\log^7 n}{\varepsilon\log(1+\varepsilon)}\right)$ bits of space and $O\left(\frac{d\log^6 n}{\varepsilon\log(1+\varepsilon)}\right)$ update time per arriving symbol. We show that for $d=o(\sqrt{n})$, any randomized algorithm with multiplicative approximation $(1+\varepsilon)$ that succeeds with probability at least $1-\frac{1}{n}$ requires $\Omega(d\log n)$ space. We further obtain a streaming algorithm that returns a $d$-near-palindrome whose length is within an additive error $E$ of the longest $d$-near-palindrome. The algorithm uses $O\left(\frac{dn\log^6 n}{E}\right)$ bits of space and $O\left(\frac{dn\log^5 n}{E}\right)$ update time. As before, we show that any randomized streaming algorithm that solves the longest $d$-near-palindrome problem for additive error $E$ with probability at least $1-\frac{1}{n}$ uses $\Omega\left(\frac{dn}{E}\right)$ space. Finally, we give an exact two-pass algorithm that solves the longest $d$-near-palindrome problem using $O(d^2\sqrt{n}\log^6 n)$ bits of space.
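The definition of a d-near-palindrome can be made concrete with a short offline sketch. It is a quadratic-time brute force that reads the whole string, not the streaming algorithm from the abstract, and the function names are hypothetical. Note that under the definition as stated (Hamming distance between a string and its reverse), each mismatched pair of positions contributes 2 to the distance.

```python
def hamming_to_reverse(text):
    """Hamming distance between text and its reverse."""
    return sum(a != b for a, b in zip(text, reversed(text)))


def is_near_palindrome(text, d):
    """True if text is within Hamming distance d of its reverse."""
    return hamming_to_reverse(text) <= d


def longest_near_palindrome(text, d):
    """Longest substring within Hamming distance d of its own reverse.

    Expands around each of the 2n-1 centers; the distance only grows
    as the window widens, so expansion can stop once the budget is exceeded.
    """
    n = len(text)
    best = (0, 0)  # half-open interval [start, end)
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2  # odd centers sit between two symbols
        cost = 0
        while left >= 0 and right < n:
            if text[left] != text[right]:
                cost += 2  # both positions of the pair disagree with the reverse
                if cost > d:
                    break
            if right + 1 - left > best[1] - best[0]:
                best = (left, right + 1)
            left -= 1
            right += 1
    return text[best[0]:best[1]]
```

For example, "abca" differs from its reverse "acba" in two positions, so it is a 2-near-palindrome but not a 1-near-palindrome.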

    Tight Tradeoffs for Real-Time Approximation of Longest Palindromes in Streams

    We consider computing a longest palindrome in the streaming model, where the symbols arrive one-by-one and we do not have random access to the input. While computing the answer exactly using sublinear space is not possible in such a setting, one can still hope for a good approximation guarantee. Our contribution is twofold. First, we provide lower bounds on the space requirements for randomized approximation algorithms processing inputs of length $n$. We rule out Las Vegas algorithms, as they cannot achieve sublinear space complexity. For Monte Carlo algorithms, we prove a lower bound of $\Omega(M \log\min\{|\Sigma|, M\})$ bits of memory; here $M = n/E$ for approximating the answer with additive error $E$, and $M = \log n / \log(1+\varepsilon)$ for approximating the answer with multiplicative error $(1+\varepsilon)$. Second, we design three real-time algorithms for this problem. Our Monte Carlo approximation algorithms for both additive and multiplicative versions of the problem use $O(M)$ words of memory. Thus the obtained lower bounds are asymptotically tight up to a logarithmic factor. The third algorithm is deterministic and finds a longest palindrome exactly if it is short. This algorithm can be run in parallel with a Monte Carlo algorithm to obtain better results in practice. Overall, both the time and space complexity of finding a longest palindrome in a stream are essentially settled.

    Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares

    We study the online variant of the language distance problem for two classical formal languages, the language of palindromes and the language of squares, and for the two most fundamental distances, the Hamming distance and the edit (Levenshtein) distance. In this problem, defined for a fixed formal language $L$, we are given a string $T$ of length $n$, and the task is to compute the minimal distance to $L$ from every prefix of $T$. We focus on the low-distance regime, where one must compute only the distances smaller than a given threshold $k$. In this work, our contribution is twofold. First, we show streaming algorithms, which access the input string $T$ only through a single left-to-right scan. Both for palindromes and squares, our algorithms use $O(k \cdot \mathrm{poly}\log n)$ space and time per character in the Hamming-distance case and $O(k^2 \cdot \mathrm{poly}\log n)$ space and time per character in the edit-distance case. These algorithms are randomised by necessity, and they err with probability inverse-polynomial in $n$. Second, we show deterministic read-only online algorithms, which are also provided with read-only random access to the already processed characters of $T$. Both for palindromes and squares, our algorithms use $O(k \cdot \mathrm{poly}\log n)$ space and time per character in the Hamming-distance case and $O(k^4 \cdot \mathrm{poly}\log n)$ space and amortised time per character in the edit-distance case. Comment: Accepted to ISAAC'2
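To illustrate the problem statement for the Hamming-distance case with palindromes, here is a naive offline baseline, not one of the small-space algorithms from the abstract. It uses the fact that the nearest palindrome of the same length is obtained by changing one symbol in each mismatched pair. The function names are hypothetical, and treating the threshold as inclusive is an assumption of this sketch.

```python
def hamming_distance_to_palindromes(text):
    """Minimum number of substitutions that turn text into a palindrome."""
    n = len(text)
    # One substitution repairs each mismatched pair (i, n-1-i).
    return sum(text[i] != text[n - 1 - i] for i in range(n // 2))


def prefix_distances(text, k):
    """Distance to the palindrome language for every prefix of text.

    Distances above the threshold k are reported as None (low-distance regime).
    Runs in quadratic time and linear space, unlike the algorithms in the abstract.
    """
    results = []
    for end in range(1, len(text) + 1):
        distance = hamming_distance_to_palindromes(text[:end])
        results.append(distance if distance <= k else None)
    return results
```

Calling `prefix_distances("abcd", 1)` gives one entry per prefix "a", "ab", "abc", "abcd"; the last prefix needs two substitutions and therefore falls outside the threshold.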

    Faster Queries for Longest Substring Palindrome After Block Edit

    Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of view. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in $O(n)$ time, where $n$ is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses $O(n)$ time and space for preprocessing, and reports the length of the LSPals in $O(\ell + \log\log n)$ time after a substring in $T$ is replaced by a string of arbitrary length $\ell$. This outperforms the query algorithm proposed in our previous work [CPM 2018], which uses $O(\ell + \log n)$ time for each query.
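For reference, a compact sketch of the linear-time idea attributed to Manacher above, in its common textbook form with separator padding. It covers only the static problem (no edits), and the function name is hypothetical.

```python
def longest_palindromic_substring(text):
    """Return one longest palindromic substring of text in O(n) time."""
    if not text:
        return ""
    # Interleave separators so that every palindrome in `padded` has odd length.
    padded = "#" + "#".join(text) + "#"
    m = len(padded)
    radius = [0] * m      # radius[i]: palindrome radius around padded[i]
    center = right = 0    # rightmost palindrome found so far spans up to `right`
    for i in range(m):
        if i < right:
            # Reuse the radius of the mirror position inside the known palindrome.
            radius[i] = min(right - i, radius[2 * center - i])
        lo = i - radius[i] - 1
        hi = i + radius[i] + 1
        while lo >= 0 and hi < m and padded[lo] == padded[hi]:
            radius[i] += 1
            lo -= 1
            hi += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    # The radius in `padded` equals the palindrome length in `text`.
    best = max(range(m), key=radius.__getitem__)
    start = (best - radius[best]) // 2
    return text[start:start + radius[best]]
```

When several longest palindromes exist, this sketch returns the leftmost one. The query structure in the abstract goes further by avoiding a full recomputation after each block edit.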