LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)
The shift distance sh(S1,S2) between two strings S1 and S2
of the same length is defined as the minimum Hamming distance between S1 and
any rotation (cyclic shift) of S2. We study the problem of sketching the
shift distance, which is the following communication complexity problem:
Strings S1 and S2 of length n are given to two identical players
(encoders), who independently compute sketches (summaries) sk(S1)
and sk(S2), respectively, so that upon receiving the two sketches,
a third player (decoder) is able to compute (or approximate)
sh(S1,S2) with high probability.
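To make the definition concrete, the following is a minimal brute-force sketch of the shift distance itself. It is not a sketching scheme: it reads both strings in full and takes time quadratic in n. The function names `hamming` and `shift_distance` are illustrative assumptions, not names from the paper.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def shift_distance(s1: str, s2: str) -> int:
    """Minimum Hamming distance between s1 and any rotation of s2.

    Brute-force reference only (hypothetical helper, not the paper's method).
    """
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s2)
    if n == 0:
        return 0
    # s2[r:] + s2[:r] is s2 rotated left by r positions.
    return min(hamming(s1, s2[r:] + s2[:r]) for r in range(n))
```

For example, `shift_distance("abcd", "cdab")` is 0 because "cdab" is a rotation of "abcd", whereas `shift_distance("abcd", "abce")` is 1.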
This paper primarily focuses on the more general k-mismatch version of the
problem, where the decoder is allowed to declare a failure if
sh(S1,S2)>k, where k is a parameter known to all parties. Andoni
et al. (STOC'13) introduced exact circular k-mismatch sketches of size
O(k+D(n)), where D(n) is the number of divisors of n. Andoni
et al. also showed that their sketch size is optimal in the class of linear
homomorphic sketches.
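The k-mismatch variant and the quantity D(n) can be illustrated with a direct computation on the full strings. This is only a reference for what the decoder must output, under the assumption that failure is reported as `None`; it does not reproduce any sketch from the cited works, and the names `circular_k_mismatch` and `num_divisors` are hypothetical.

```python
from typing import Optional


def circular_k_mismatch(s1: str, s2: str, k: int) -> Optional[int]:
    """Return sh(s1, s2) if it is at most k, otherwise None (failure)."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s2)
    if n == 0:
        return 0
    best = None
    for r in range(n):
        rotated = s2[r:] + s2[:r]  # s2 rotated left by r positions
        mismatches = 0
        for x, y in zip(s1, rotated):
            if x != y:
                mismatches += 1
                if mismatches > k:
                    break  # this rotation already exceeds the threshold
        if mismatches <= k and (best is None or mismatches < best):
            best = mismatches
    return best


def num_divisors(n: int) -> int:
    """D(n): the number of positive divisors of n, by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            # d and n // d are both divisors; count once if they coincide.
            count += 1 if d * d == n else 2
        d += 1
    return count
```

For instance, `num_divisors(12)` is 6 (the divisors are 1, 2, 3, 4, 6, 12), so the additive D(n) term depends on the arithmetic structure of n and not only on its magnitude.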
We circumvent this lower bound by designing a (non-linear) exact circular
k-mismatch sketch of size O(k); this size matches
communication-complexity lower bounds. We also design a (1±ε)-approximate circular k-mismatch sketch of size
O(min(ε^{-2} k, ε^{-1.5} n)),
which improves upon an O(ε^{-2} n)-size sketch of
Crouch and McGregor (APPROX'11).