We revisit the k-mismatch problem in the streaming model on a pattern of
length m and a streaming text of length n, both over a size-σ
alphabet. The current state-of-the-art algorithm for the streaming k-mismatch
problem, by Clifford et al. [SODA 2019], uses O~(k) space and O~(√k) worst-case time per character. The space complexity is
known to be (unconditionally) optimal, and the worst-case time per character
matches a conditional lower bound. However, there is a gap between the total
time cost of the algorithm, which is O~(n√k), and the fastest
known offline algorithm, which costs O~(n + min(nk/√m, σn)) time. Moreover, it is not known whether improvements
over the O~(n√k) total time are possible when using more than
O(k) space.
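To make the problem statement concrete, here is a minimal brute-force sketch (not the streaming algorithm discussed in this work): for every alignment of the pattern against the text, it counts mismatched characters and reports the alignments with Hamming distance at most k. Its O(nm) cost is exactly what the offline and streaming algorithms above improve upon; the function name and output format are illustrative choices, not from the source.

```python
def k_mismatch(pattern, text, k):
    """Naive offline k-mismatch: return (shift, distance) pairs for every
    alignment of `pattern` in `text` with Hamming distance at most k.
    This O(nm)-time check only illustrates the problem definition; it is
    not the streaming algorithm described in the abstract."""
    m, n = len(pattern), len(text)
    occurrences = []
    for i in range(n - m + 1):
        # Hamming distance between the pattern and the window text[i : i+m].
        mismatches = sum(1 for a, b in zip(pattern, text[i:i + m]) if a != b)
        if mismatches <= k:
            occurrences.append((i, mismatches))
    return occurrences

# "abab" aligns with "ababcbbab" exactly at shift 0, and within
# one mismatch at shifts 2 and 5.
print(k_mismatch("abab", "ababcbbab", 1))  # → [(0, 0), (2, 1), (5, 1)]
```

A streaming algorithm must produce the same answers while reading the text one character at a time, using space far smaller than n or m.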
We address these gaps by designing a randomized streaming algorithm for the
k-mismatch problem that, given an integer parameter k≤s≤m, uses
O~(s) space and costs O~(n + min(nk²/m, nk/√s, σn√(s/m))) total time. For s = m,
the total runtime becomes O~(n + min(nk/√m, σn)), which matches the time cost of the fastest offline algorithm.
Moreover, the worst-case time cost per character is still O~(√k).

Comment: Extended abstract to appear in CPM 2020