We show that approximate similarity (near neighbour) search can be solved in
high dimensions with performance matching state-of-the-art (data-independent)
Locality Sensitive Hashing, but with a guarantee of no false negatives.
Specifically, we give two data structures for common problems.
For $c$-approximate near neighbour in Hamming space we get query time
$dn^{1/c+o(1)}$ and space $dn^{1+1/c+o(1)}$, matching that of
\cite{indyk1998approximate} and answering a long-standing open question
from~\cite{indyk2000dimensionality} and~\cite{pagh2016locality} in the
affirmative.
By means of a new deterministic reduction from $\ell_1$ to Hamming space we also
solve $\ell_1$ and $\ell_2$ with query time $d^2n^{1/c+o(1)}$ and space $d^2n^{1+1/c+o(1)}$.
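As background, the folklore embedding of integer-valued $\ell_1$ into Hamming space is via unary codes. The sketch below is our own illustration of that identity only; the paper's reduction is a different, dimension-efficient deterministic construction.

```python
def unary_embed(x, M):
    """Map an integer vector with coordinates in [0, M] into {0,1}^(d*M):
    each coordinate v becomes v ones followed by M - v zeros.
    Under this map, Hamming distance equals l1 distance exactly."""
    bits = []
    for v in x:
        bits.extend([1] * v + [0] * (M - v))
    return bits

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

def l1(x, y):
    return sum(abs(u - v) for u, v in zip(x, y))

# Example (our own): two points in {0,...,4}^3.
x, y, M = [3, 0, 2], [1, 2, 2], 4
# l1(x, y) == hamming(unary_embed(x, M), unary_embed(y, M)) == 4
```

Note the dimension blows up from $d$ to $dM$, which is why a more careful deterministic reduction is needed to obtain the stated $d^2$ factors.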
For $(s_1,s_2)$-approximate Jaccard similarity we get query time
$dn^{\rho+o(1)}$ and space $dn^{1+\rho+o(1)}$,
$\rho=\log\frac{2s_1}{1+s_1}\big/\log\frac{2s_2}{1+s_2}$, when sets have equal
size, matching the performance of~\cite{tobias2016}.
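As a quick numeric sanity check of the exponent formula (the values $s_1=0.5$, $s_2=0.25$ are our own example, not from the paper), $\rho$ can be evaluated directly:

```python
import math

def rho(s1, s2):
    """Exponent from the abstract: rho = log(2*s1/(1+s1)) / log(2*s2/(1+s2)),
    for a (s1, s2)-approximate Jaccard similarity search problem with s1 > s2."""
    return math.log(2 * s1 / (1 + s1)) / math.log(2 * s2 / (1 + s2))

# Example: distinguishing similarity >= 0.5 from similarity <= 0.25
# gives an exponent strictly between 0 and 1 (sublinear query time).
r = rho(0.5, 0.25)
```

A larger gap between $s_1$ and $s_2$ drives $\rho$ toward 0, i.e. faster queries, as expected.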
The algorithms are based on space partitions, as with classic LSH, but we
construct these using a combination of brute force, tensoring, perfect hashing
and splitter functions \`a la~\cite{naor1995splitters}. We also show a new
dimensionality reduction lemma with 1-sided error.