183 research outputs found

Palindrome Recognition In The Streaming Model

Author: Azer Erfan Sadeqi
Berenbrink Petra
Ergün Funda
Mallmann-Trenn Frederik
Publication date: 28/01/2016

In the Palindrome Problem one tries to find all palindromes (palindromic substrings) in a given string. A palindrome is defined as a string which reads forwards the same as backwards, e.g., the string "racecar". A related problem is the Longest Palindromic Substring Problem, in which finding an arbitrary one of the longest palindromes in the given string suffices. We consider the streaming version of both problems. In the streaming model the input arrives over time and at every point in time we are only allowed to use sublinear space. The main algorithms in this paper are the following. The first one is a one-pass randomized algorithm that solves the Palindrome Problem. It has an additive error and uses $O(\sqrt{n})$ space. The second algorithm is a two-pass algorithm which determines the exact locations of all longest palindromes. It uses the first algorithm as the first pass. The third algorithm is again a one-pass randomized algorithm, which solves the Longest Palindromic Substring Problem. It has a multiplicative error using only $O(\log(n))$ space. We also give two variants of the first algorithm which solve other related practical problems.
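The abstract does not spell out how a palindrome can be recognized without storing the input. As a rough illustration only, and not the paper's algorithm, the sketch below uses Karp-Rabin style fingerprints to report which prefixes of a stream are palindromes while keeping a constant number of integers. The function name `palindromic_prefix_lengths`, the modulus, and the random choice of base are assumptions made for this example. The check is randomized: a true palindrome is always reported, and a false report is possible only with very small probability.

```python
import random

# Assumption for this sketch: a large Mersenne prime as the fingerprint modulus.
MOD = (1 << 61) - 1


def palindromic_prefix_lengths(stream, base=None):
    """Yield the length of every prefix of `stream` that is (probably) a palindrome.

    Only three integers are kept between symbols: a fingerprint of the
    prefix, a fingerprint of the reversed prefix, and the current power
    of the base.
    """
    if base is None:
        base = random.randrange(256, MOD - 1)
    forward = 0  # sum of s[i] * base^(n-1-i): fingerprint of the prefix
    reverse = 0  # sum of s[i] * base^i: fingerprint of the reversed prefix
    power = 1    # base^n mod MOD, where n symbols have been read so far
    for n, ch in enumerate(stream, start=1):
        code = ord(ch) + 1  # shift so that no symbol maps to zero
        forward = (forward * base + code) % MOD
        reverse = (reverse + code * power) % MOD
        power = (power * base) % MOD
        if forward == reverse:
            yield n


if __name__ == "__main__":
    print(list(palindromic_prefix_lengths("racecar")))
```

Handling palindromes that start at arbitrary positions, as the abstract describes, requires storing additional fingerprints at selected positions, which is where the sublinear space bounds come from.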

arXiv.org e-Print Archive

CiteSeerX

Palindrome Recognition In The Streaming Model

Author: Berenbrink Petra
Mallmann-Trenn Frederik
Sadeqi Azer Erfan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)
Publication date: 01/01/2014

A palindrome is defined as a string which reads forwards the same as backwards, like, for example, the string "racecar". In the Palindrome Problem, one tries to find all palindromes in a given string. In contrast, in the case of the Longest Palindromic Substring Problem, the goal is to find an arbitrary one of the longest palindromes in the string. In this paper we present three algorithms in the streaming model for the above problems, where at any point in time we are only allowed to use sublinear space. We first present a one-pass randomized algorithm that solves the Palindrome Problem. It has an additive error and uses $O(\sqrt{n})$ space. We also give two variants of the algorithm which solve related and practical problems. The second algorithm determines the exact locations of all longest palindromes using two passes and $O(\sqrt{n})$ space. The third algorithm is a one-pass randomized algorithm, which solves the Longest Palindromic Substring Problem. It has a multiplicative error using only $O(\log(n))$ space.

Dagstuhl Research Online Publication Server

Streaming for Aibohphobes: Longest Palindrome with Mismatches

Author: Grigorescu Elena
Sadeqi Azer Erfan
Zhou Samson
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)
Publication date: 04/05/2017

A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given a metric and an integer $d>0$, a $d$-near-palindrome is a string of Hamming distance at most $d$ from its reverse. We study the natural problem of identifying the longest $d$-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present the first streaming algorithm for the longest $d$-near-palindrome problem that returns a $d$-near-palindrome whose length is within a multiplicative $(1+\varepsilon)$-factor of the longest $d$-near-palindrome. Our algorithm also returns the set of mismatched indices in the $d$-near-palindrome, and uses $O\left(\frac{d\log^7 n}{\varepsilon\log(1+\varepsilon)}\right)$ bits of space, and $O\left(\frac{d\log^6 n}{\varepsilon\log(1+\varepsilon)}\right)$ update time per arrival symbol. We show that for $d=o(\sqrt{n})$, any randomized algorithm with multiplicative approximation $(1+\varepsilon)$ that succeeds with probability at least $1-\frac{1}{n}$ requires $\Omega(d\log n)$ space. We further obtain a streaming algorithm that returns a $d$-near-palindrome whose length is within an additive $E$-error of the longest $d$-near-palindrome. The algorithm uses $O\left(\frac{dn\log^6 n}{E}\right)$ bits of space and $O\left(\frac{dn\log^5 n}{E}\right)$ update time. As before, we show that any randomized streaming algorithm that solves the longest $d$-near-palindrome problem for additive error $E$ with probability at least $1-\frac{1}{n}$ uses $\Omega\left(\frac{dn}{E}\right)$ space. Finally, we give an exact two-pass algorithm that solves the longest $d$-near-palindrome problem using $O(d^2\sqrt{n}\log^6 n)$ bits of space.
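To make the definition concrete, here is a minimal offline sketch that takes the abstract's wording literally: it compares a string with its reverse position by position, returns the mismatched indices, and tests the bound $d$. It stores the whole string, so it is not a streaming algorithm and says nothing about the paper's method. The names `mismatched_indices` and `is_d_near_palindrome` are hypothetical.

```python
def mismatched_indices(s):
    """Return the positions at which `s` differs from its reverse."""
    n = len(s)
    return [i for i in range(n) if s[i] != s[n - 1 - i]]


def is_d_near_palindrome(s, d):
    """True if the Hamming distance between `s` and its reverse is at most d."""
    return len(mismatched_indices(s)) <= d


if __name__ == "__main__":
    # "abcda" reversed is "adcba"; the two strings differ at positions 1 and 3.
    print(mismatched_indices("abcda"))
    print(is_d_near_palindrome("abcda", 2))
```

Under this literal reading, mismatches come in mirrored pairs, so the distance between a string and its reverse is always even.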

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Tight Tradeoffs for Real-Time Approximation of Longest Palindromes in Streams

Author: Gawrychowski Pawel
Merkurev Oleg
Shur Arseny
Uznanski Przemyslaw
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)
Publication date: 01/01/2016

We consider computing a longest palindrome in the streaming model, where the symbols arrive one-by-one and we do not have random access to the input. While computing the answer exactly using sublinear space is not possible in such a setting, one can still hope for a good approximation guarantee. Our contribution is twofold. First, we provide lower bounds on the space requirements for randomized approximation algorithms processing inputs of length $n$. We rule out Las Vegas algorithms, as they cannot achieve sublinear space complexity. For Monte Carlo algorithms, we prove a lower bound of $\Omega(M \log\min\{|\Sigma|, M\})$ bits of memory; here $M = n/E$ for approximating the answer with additive error $E$, and $M = \log n / \log(1+\varepsilon)$ for approximating the answer with multiplicative error $(1+\varepsilon)$. Second, we design three real-time algorithms for this problem. Our Monte Carlo approximation algorithms for both additive and multiplicative versions of the problem use $O(M)$ words of memory. Thus the obtained lower bounds are asymptotically tight up to a logarithmic factor. The third algorithm is deterministic and finds a longest palindrome exactly if it is short. This algorithm can be run in parallel with a Monte Carlo algorithm to obtain better results in practice. Overall, both the time and space complexity of finding a longest palindrome in a stream are essentially settled.
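For a sense of scale, the parameter $M$ from the abstract can be evaluated numerically. The sketch below simply plugs values into the two stated formulas and into the expression inside the $\Omega(\cdot)$ bound. The function names are hypothetical, base-2 logarithms are an assumption for the bit count, and the constants hidden by the asymptotic notation are ignored, so the numbers indicate growth rather than actual memory requirements.

```python
import math


def m_additive(n, error):
    """M = n / E for additive error E."""
    return n / error


def m_multiplicative(n, epsilon):
    """M = log n / log(1 + epsilon); the base of the logarithm cancels."""
    return math.log(n) / math.log(1 + epsilon)


def lower_bound_expression(m, alphabet_size):
    """M * log2(min(|Sigma|, M)), the expression inside the Omega bound."""
    return m * math.log2(min(alphabet_size, m))


if __name__ == "__main__":
    n = 2 ** 20
    print(m_additive(n, 1024))            # additive error E = 1024
    print(m_multiplicative(n, 1.0))       # multiplicative error factor 2
    print(lower_bound_expression(1024, 4))
```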

arXiv.org e-Print Archive

Repository for Publications and Research Data

Dagstuhl Research Online Publication Server

Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares

Author: Bathie Gabriel
Kociumaka Tomasz
Starikovskaya Tatiana
Publication date: 26/09/2023

We study the online variant of the language distance problem for two classical formal languages, the language of palindromes and the language of squares, and for the two most fundamental distances, the Hamming distance and the edit (Levenshtein) distance. In this problem, defined for a fixed formal language $L$, we are given a string $T$ of length $n$, and the task is to compute the minimal distance to $L$ from every prefix of $T$. We focus on the low-distance regime, where one must compute only the distances smaller than a given threshold $k$. In this work, our contribution is twofold:
- First, we show streaming algorithms, which access the input string $T$ only through a single left-to-right scan. Both for palindromes and squares, our algorithms use $O(k \cdot \mathrm{poly}\log n)$ space and time per character in the Hamming-distance case and $O(k^2 \cdot \mathrm{poly}\log n)$ space and time per character in the edit-distance case. These algorithms are randomised by necessity, and they err with probability inverse-polynomial in $n$.
- Second, we show deterministic read-only online algorithms, which are also provided with read-only random access to the already processed characters of $T$. Both for palindromes and squares, our algorithms use $O(k \cdot \mathrm{poly}\log n)$ space and time per character in the Hamming-distance case and $O(k^4 \cdot \mathrm{poly}\log n)$ space and amortised time per character in the edit-distance case.
Comment: Accepted to ISAAC'2
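The problem statement can be illustrated with a brute-force baseline for the palindrome language under the Hamming distance. It relies on the standard observation that the Hamming distance from a string to the nearest palindrome of the same length equals the number of mirrored pairs that disagree, since changing one character of each such pair suffices. The sketch keeps the whole string in memory and takes quadratic time, so it only shows what is being computed and is unrelated to the small-space algorithms of the paper. The name `prefix_palindrome_distances` and the use of `None` for distances that are not smaller than $k$ are assumptions.

```python
def prefix_palindrome_distances(text, k):
    """For each prefix of `text`, return its Hamming distance to the nearest
    palindrome if that distance is smaller than k, and None otherwise."""
    result = []
    for n in range(1, len(text) + 1):
        # Count mirrored pairs (i, n-1-i) of the length-n prefix that disagree.
        mismatches = sum(1 for i in range(n // 2) if text[i] != text[n - 1 - i])
        result.append(mismatches if mismatches < k else None)
    return result


if __name__ == "__main__":
    print(prefix_palindrome_distances("abcd", 2))
```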

arXiv.org e-Print Archive

Faster Queries for Longest Substring Palindrome After Block Edit

Author: Bannai Hideo
Funakoshi Mitsuru
Inenaga Shunsuke
Nakashima Yuto
Takeda Masayuki
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Publication date: 01/01/2019

Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of view. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and reports the length of the LSPals in O(l + log log n) time, after a substring in T is replaced by a string of arbitrary length l. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(l + log n) time for each query.
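For reference, the linear-time computation attributed to Manacher above can be sketched as follows. This is a common textbook formulation rather than code from the paper, and it covers only the static problem, not queries after a block edit. The function name `longest_palindromic_substring` and the separator-interleaving trick are choices made for this example.

```python
def longest_palindromic_substring(s):
    """Return one longest palindromic substring of `s` in O(len(s)) time."""
    # Interleave a separator so odd- and even-length palindromes are handled
    # uniformly; mirrored positions always have the same parity, so the
    # separator never gets compared with a character of `s`.
    t = "#" + "#".join(s) + "#"
    n = len(t)
    radius = [0] * n   # radius[i]: palindrome radius in t centered at i
    center = right = 0  # center and right end of the rightmost palindrome seen
    for i in range(n):
        if i < right:
            # Reuse the radius of the mirror position inside the known palindrome.
            radius[i] = min(right - i, radius[2 * center - i])
        a, b = i - radius[i] - 1, i + radius[i] + 1
        while a >= 0 and b < n and t[a] == t[b]:
            radius[i] += 1
            a -= 1
            b += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    best = max(range(n), key=lambda i: radius[i])
    # A radius in t equals the palindrome length in s.
    start = (best - radius[best]) // 2
    return s[start:start + radius[best]]


if __name__ == "__main__":
    print(longest_palindromic_substring("xracecary"))
```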

Dagstuhl Research Online Publication Server