Differentially Private Approximate Pattern Matching

Abstract

In this paper, we consider the $k$-approximate pattern matching problem under differential privacy, where the goal is to report or count all substrings of a given string $S$ which have a Hamming distance at most $k$ to a pattern $P$, or decide whether such a substring exists. In our definition of privacy, individual positions of the string $S$ are protected. To be able to answer queries under differential privacy, we allow some slack on $k$, i.e. we allow reporting or counting substrings of $S$ with a distance at most $(1+\gamma)k+\alpha$ to $P$, for a multiplicative error $\gamma$ and an additive error $\alpha$. We analyze which values of $\alpha$ and $\gamma$ are necessary or sufficient to solve the $k$-approximate pattern matching problem while satisfying $\epsilon$-differential privacy. Let $n$ denote the length of $S$. We give 1) an $\epsilon$-differentially private algorithm with an additive error of $O(\epsilon^{-1}\log n)$ and no multiplicative error for the existence variant; 2) an $\epsilon$-differentially private algorithm with an additive error of $O(\epsilon^{-1}\max(k,\log n)\cdot\log n)$ for the counting variant; 3) an $\epsilon$-differentially private algorithm with an additive error of $O(\epsilon^{-1}\log n)$ and multiplicative error $O(1)$ for the reporting variant for a special class of patterns. The error bounds hold with high probability. All of these algorithms return a witness, that is, if there exists a substring of $S$ with distance at most $k$ to $P$, then the algorithm returns a substring of $S$ with distance at most $(1+\gamma)k+\alpha$ to $P$. Further, we complement these results by a lower bound, showing that any algorithm for the existence variant which also returns a witness must have an additive error of $\Omega(\epsilon^{-1}\log n)$ with constant probability.

Comment: This is a full version of a paper accepted to ITCS 202
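To make the problem statement in the abstract concrete, the following is a minimal, non-private sketch of the three variants (existence, counting, reporting) and of the relaxed threshold $(1+\gamma)k+\alpha$. It is a brute-force illustration only: it adds no noise, is not differentially private, and is not the paper's algorithm. The function names (`window_distances`, `report`, `count`, `exists`, `relaxed_threshold`) and the example string are assumptions introduced here for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def window_distances(s: str, p: str) -> list[int]:
    """Hamming distance from p to every length-|p| substring of s."""
    m = len(p)
    return [hamming(s[i:i + m], p) for i in range(len(s) - m + 1)]


def relaxed_threshold(k: int, gamma: float, alpha: float) -> float:
    """Distance bound (1 + gamma) * k + alpha allowed by the slack."""
    return (1 + gamma) * k + alpha


def report(s: str, p: str, threshold: float) -> list[int]:
    """Reporting variant: start positions of substrings within threshold."""
    return [i for i, d in enumerate(window_distances(s, p)) if d <= threshold]


def count(s: str, p: str, threshold: float) -> int:
    """Counting variant: number of substrings within threshold."""
    return len(report(s, p, threshold))


def exists(s: str, p: str, threshold: float) -> bool:
    """Existence variant: is any substring within threshold?"""
    return count(s, p, threshold) > 0


if __name__ == "__main__":
    s, p, k = "abracadabra", "abra", 1  # hypothetical example input
    print(window_distances(s, p))
    print(report(s, p, k))                            # exact threshold k
    print(report(s, p, relaxed_threshold(k, 0, 2)))   # slack: gamma=0, alpha=2
```

In this example, the exact threshold $k=1$ admits only the two exact occurrences of the pattern, while the relaxed threshold with $\gamma=0$ and $\alpha=2$ also admits the substrings at distance 3. A private algorithm with additive error $\alpha$ is allowed to return any of those as a witness.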
