Privately Matching k-mers

Abstract

We construct the first noninteractive protocols for several tasks related to private set intersection. We provide efficient protocols for three related problems, each motivated by a particular kind of genomic testing. Set intersection with labelling hides the intersecting set itself and returns only the labels of the common elements, allowing a genomics company to return diagnoses without exposing the intellectual property in its database. Fuzzy matching with labelling extends this to allow matching at a particular Hamming distance, which solves the same problem while accounting for genetic variation. Closest matching returns the item in the server's database closest to the client's query; this can be used for taxonomic classification. Our protocols are optimised for matching k-mers (sets of k-length strings) rather than individual nucleotides, which is particularly useful for representing the short reads produced by next-generation sequencing technologies.
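To make the three problem statements concrete, the following is a minimal plaintext sketch of what each task computes, with no cryptography and no privacy. It is not the protocol of this paper; it only illustrates the intended outputs. All function names and the example data are hypothetical, "matching at a particular Hamming distance" is assumed to mean distance at most d, and closest matching is assumed to use Hamming distance as well.

```python
from typing import Dict, List, Set


def kmers(read: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def intersection_labels(query: Set[str], db: Dict[str, str]) -> Set[str]:
    """Set intersection with labelling: return only the labels of
    common elements, not the elements themselves."""
    return {db[x] for x in query if x in db}


def fuzzy_labels(query: Set[str], db: Dict[str, str], d: int) -> Set[str]:
    """Fuzzy matching with labelling (assumption: threshold is 'at most d')."""
    return {label
            for x in query
            for y, label in db.items()
            if hamming(x, y) <= d}


def closest_match(query: str, db: List[str]) -> str:
    """Closest matching (assumption: Hamming distance, first minimum wins)."""
    return min(db, key=lambda y: hamming(query, y))


# Hypothetical example data.
client_kmers = kmers("ACGTAC", 3)   # ACG, CGT, GTA, TAC
server_db = {"CGT": "marker-A", "TTT": "marker-B", "GTC": "marker-C"}

exact = intersection_labels(client_kmers, server_db)
fuzzy = fuzzy_labels(client_kmers, server_db, 1)
closest = closest_match("GTA", list(server_db))
```

In this toy instance the exact variant yields only the label for CGT, the fuzzy variant additionally picks up the label for GTC (one substitution away from GTA), and the closest match to GTA is GTC. A private protocol would have to produce the same outputs without revealing the client's k-mers to the server or the server's database entries to the client.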
