Communication and Streaming Complexity of Approximate Pattern Matching
We consider the approximate pattern matching problem. Given a text T of length 2n and a pattern P of length n, the task is to decide for each prefix T[1, j] of T whether it ends with a string that is at edit distance at most k from P. If so, we must output the edit distance and the corresponding edit operations. We first establish the communication complexity of the problem in two settings: when Alice and Bob both hold the pattern, with Alice holding the first half of the text and Bob the second half; and when Alice holds the first half of the text, Bob the second half, and Charlie the pattern. We then develop the first sublinear-space streaming algorithm for the problem. The algorithm is randomised with error probability at most 1/poly(n).
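To make the problem statement concrete, here is a minimal sketch of the classical column-by-column dynamic program (often attributed to Sellers) that reports, for each end position j of the text, the smallest edit distance between the pattern and any substring ending at j. This is a textbook baseline and not the streaming algorithm of this work: it stores a full column of len(pattern) + 1 values, so its space is linear in the pattern length, and it reports only distances, not the edit operations. The function name `approx_match_ends` is hypothetical.

```python
def approx_match_ends(text, pattern, k):
    """Map each 1-based end position j of `text` to the smallest edit
    distance d <= k between `pattern` and a substring of `text` ending at j.
    Positions with no such substring are omitted."""
    m = len(pattern)
    # col[i] = min edit distance between pattern[:i] and any substring of
    # the text ending at the current position. A match may start anywhere,
    # so col[0] stays 0 throughout.
    col = list(range(m + 1))
    hits = {}
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # entry (i-1) of the previous column
        for i in range(1, m + 1):
            cur = min(
                prev_diag + (pattern[i - 1] != c),  # match or substitution
                col[i] + 1,      # text character c left unmatched
                col[i - 1] + 1,  # pattern character left unmatched
            )
            prev_diag = col[i]
            col[i] = cur
        if col[m] <= k:
            hits[j] = col[m]
    return hits


print(approx_match_ends("xabcx", "abc", 1))  # {3: 1, 4: 0, 5: 1}
```

With k = 0 the same call reports only the exact occurrence ending at position 4.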
Small space and streaming pattern matching with k edits
In this work, we revisit the fundamental and well-studied problem of
approximate pattern matching under edit distance. Given an integer $k$, a
pattern $P$ of length $m$, and a text $T$ of length $n \ge m$, the task is to
find substrings of $T$ that are within edit distance $k$ from $P$. Our main
result is a streaming algorithm that solves the problem in $\tilde{O}(k^5)$
space and $\tilde{O}(k^8)$ amortised time per character of the text, providing
answers correct with high probability. (Hereafter, $\tilde{O}(\cdot)$ hides a
$\mathrm{poly}(\log n)$ factor.) This answers a decade-old question: since the
discovery of a $\mathrm{poly}(k \log n)$-space streaming algorithm for pattern
matching under Hamming distance by Porat and Porat [FOCS 2009], the existence
of an analogous result for edit distance remained open. Up to this work, no
$\mathrm{poly}(k \log n)$-space algorithm was known even in the simpler
semi-streaming model, where $T$ comes as a stream but $P$ is available for
read-only access. In this model, we give a deterministic algorithm that
achieves slightly better complexity.
In order to develop the fully streaming algorithm, we introduce a new edit
distance sketch parametrised by integers $n \ge k$. For any string of length at
most $n$, the sketch is of size $\tilde{O}(k^2)$ and it can be computed with an
$\tilde{O}(k^2)$-space streaming algorithm. Given the sketches of two strings,
in $\tilde{O}(k^3)$ time we can compute their edit distance or certify that it
is larger than $k$. This result improves upon $\tilde{O}(k^8)$-size sketches of
Belazzougui and Zhu [FOCS 2016] and very recent $\tilde{O}(k^3)$-size sketches
of Jin, Nelson, and Wu [STACS 2021].
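The comparison step described above has a simple contract: return the edit distance of two strings, or certify that it exceeds a threshold (called k in the code). The sketch below illustrates only that contract, using the classical banded dynamic program on the uncompressed strings. It is not the sketch construction of this work and reads both strings in full. The function name `bounded_edit_distance` is hypothetical.

```python
def bounded_edit_distance(a, b, k):
    """Return the edit distance of a and b if it is at most k, else None."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length gap alone forces more than k edits
    inf = k + 1  # every value above k is treated as "too large"
    # prev[j] = distance between a[:i-1] and b[:j], capped at inf.
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= k:
            cur[0] = i
        # An alignment of cost at most k never leaves the band |i - j| <= k,
        # so the recurrence is evaluated only on those cells.
        lo, hi = max(1, i - k), min(m, i + k)
        for j in range(lo, hi + 1):
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or substitution
                prev[j] + 1,     # delete a[i-1]
                cur[j - 1] + 1,  # insert b[j-1]
                inf,
            )
        prev = cur
    return prev[m] if prev[m] <= k else None


print(bounded_edit_distance("kitten", "sitting", 3))  # 3
print(bounded_edit_distance("kitten", "sitting", 2))  # None
```

Restricting the recurrence to the band means only O(k) cells per row are evaluated, which is the standard way to exploit a known distance threshold.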