The streaming k-mismatch problem
We consider the streaming complexity of a fundamental task in approximate
pattern matching: the k-mismatch problem. It asks to compute Hamming
distances between a pattern of length n and all length-n substrings of a
text for which the Hamming distance does not exceed a given threshold k. In
our problem formulation, we report not only the Hamming distance but also, on
demand, the full \emph{mismatch information}, that is the list of mismatched
pairs of symbols and their indices. The twin challenges of streaming pattern
matching derive from the need both to achieve small working space and also to
guarantee that every arriving input symbol is processed quickly.
We present a streaming algorithm for the k-mismatch problem which uses
bits of space and spends \ourcomplexity time on
each symbol of the input stream, which consists of the pattern followed by the
text. The running time almost matches the classic offline solution and the
space usage is within a logarithmic factor of optimal.
Our new algorithm therefore effectively resolves and also extends an open
problem first posed in FOCS'09. En route to this solution, we also give a
deterministic -bit encoding of all
the alignments with Hamming distance at most k of a length-n pattern within
a text of length . This secondary result provides an optimal solution to
a natural communication complexity problem which may be of independent
interest. Comment: 27 pages
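To make the problem statement concrete, here is a minimal offline sketch of what a k-mismatch solver must output. It is a naive quadratic-time baseline, not the streaming algorithm of the abstract; the function name `k_mismatch_naive` and the tuple layout of the mismatch information are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# (index within the pattern, pattern symbol, text symbol); layout is an assumption.
Mismatch = Tuple[int, str, str]


def k_mismatch_naive(pattern: str, text: str, k: int) -> Dict[int, List[Mismatch]]:
    """Map each alignment with Hamming distance <= k to its mismatch information.

    Naive O(n * |text|) baseline that reads the whole input; it only
    illustrates the required output, not the small-space streaming method.
    """
    n = len(pattern)
    result: Dict[int, List[Mismatch]] = {}
    for start in range(len(text) - n + 1):
        mismatches: List[Mismatch] = []
        for i in range(n):
            if pattern[i] != text[start + i]:
                mismatches.append((i, pattern[i], text[start + i]))
                if len(mismatches) > k:
                    break  # threshold exceeded, this alignment is not reported
        if len(mismatches) <= k:
            result[start] = mismatches
    return result


if __name__ == "__main__":
    for start, info in k_mismatch_naive("abc", "abcabdxbc", 1).items():
        print(start, len(info), info)
```

The Hamming distance of a reported alignment is the length of its mismatch list, so both outputs named in the abstract come from the same structure.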
Classification and Galois conjugacy of Hamming maps
We show that for each d>0 the d-dimensional Hamming graph H(d,q) has an
orientably regular surface embedding if and only if q is a prime power p^e. If
q>2 there are up to isomorphism \phi(q-1)/e such maps, all constructed as
Cayley maps for a d-dimensional vector space over the field of order q. We show
that for each such pair d, q the corresponding Belyi pairs are conjugate under
the action of the absolute Galois group, and we determine their minimal field
of definition. We also classify the orientably regular embeddings of merged
Hamming graphs for q>3.
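For readers unfamiliar with the object being embedded, the following sketch builds the Hamming graph H(d,q) under its standard definition: vertices are d-tuples over a q-letter alphabet, and two tuples are adjacent when they differ in exactly one coordinate. It does not construct any map or embedding; the function name `hamming_graph` is an assumption for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

Vertex = Tuple[int, ...]


def hamming_graph(d: int, q: int) -> Dict[Vertex, List[Vertex]]:
    """Adjacency lists of H(d,q): tuples in {0..q-1}^d, adjacent iff they
    differ in exactly one coordinate."""
    adjacency: Dict[Vertex, List[Vertex]] = {}
    for v in product(range(q), repeat=d):
        neighbours = []
        for i in range(d):
            for s in range(q):
                if s != v[i]:
                    # Change coordinate i only, keeping all others fixed.
                    neighbours.append(v[:i] + (s,) + v[i + 1:])
        adjacency[v] = neighbours
    return adjacency


if __name__ == "__main__":
    g = hamming_graph(2, 3)
    print(len(g), {len(nbrs) for nbrs in g.values()})
```

Every vertex has degree d(q-1), so H(d,q) has q^d vertices and q^d d(q-1)/2 edges.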
Clustering in Hilbert space of a quantum optimization problem
The solution space of many classical optimization problems breaks up into
clusters which are extensively distant from one another in the Hamming metric.
Here, we show that an analogous quantum clustering phenomenon takes place in
the ground state subspace of a certain quantum optimization problem. This
involves extending the notion of clustering to Hilbert space, where the
classical Hamming distance is not immediately useful. Quantum clusters
correspond to macroscopically distinct subspaces of the full quantum ground
state space which grow with the system size. We explicitly demonstrate that
such clusters arise in the solution space of random quantum satisfiability
(3-QSAT) at its satisfiability transition. We estimate both the number of these
clusters and their internal entropy. The former are given by the number of
hardcore dimer coverings of the core of the interaction graph, while the latter
is related to the underconstrained degrees of freedom not touched by the
dimers. We additionally provide new numerical evidence suggesting that the
3-QSAT satisfiability transition may coincide with the product satisfiability
transition, which would imply the absence of an intermediate entangled
satisfiable phase. Comment: 11 pages, 6 figures
Optimal Las Vegas Locality Sensitive Data Structures
We show that approximate similarity (near neighbour) search can be solved in
high dimensions with performance matching state-of-the-art (data-independent)
Locality Sensitive Hashing, but with a guarantee of no false negatives.
Specifically, we give two data structures for common problems.
For -approximate near neighbour in Hamming space we get query time
and space matching that of
\cite{indyk1998approximate} and answering a long-standing open question
from~\cite{indyk2000dimensionality} and~\cite{pagh2016locality} in the
affirmative.
By means of a new deterministic reduction from to Hamming we also
solve and with query time and space .
For -approximate Jaccard similarity we get query time
and space ,
, when sets have equal
size, matching the performance of~\cite{tobias2016}.
The algorithms are based on space partitions, as with classic LSH, but we
construct these using a combination of brute force, tensoring, perfect hashing
and splitter functions à la~\cite{naor1995splitters}. We also show a new
dimensionality reduction lemma with 1-sided error.
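The "no false negatives" guarantee can be illustrated with a much simpler, well-known pigeonhole filter for exact radius-r search in Hamming space. This is not the construction of the abstract and has none of its performance guarantees; it only shows how a space partition can make misses impossible. If the d coordinates are split into r+1 blocks, any point within distance r of the query must agree with it exactly on at least one block. The class name `PigeonholeIndex` and its methods are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


class PigeonholeIndex:
    """Radius-r Hamming search with no false negatives (illustrative only)."""

    def __init__(self, points: Sequence[str], r: int) -> None:
        self.points = list(points)
        self.r = r
        d = len(self.points[0])
        # Boundaries of r+1 contiguous blocks covering all d coordinates.
        self.bounds = [j * d // (r + 1) for j in range(r + 2)]
        self.tables: List[Dict[str, List[int]]] = [
            defaultdict(list) for _ in range(r + 1)
        ]
        for idx, p in enumerate(self.points):
            for j in range(r + 1):
                self.tables[j][p[self.bounds[j]:self.bounds[j + 1]]].append(idx)

    def query(self, q: str) -> List[int]:
        """Indices of all stored points within Hamming distance r of q."""
        candidates = set()
        for j in range(self.r + 1):
            key = q[self.bounds[j]:self.bounds[j + 1]]
            candidates.update(self.tables[j].get(key, []))
        # Candidates may include far points, so verify each one.
        return sorted(i for i in candidates if hamming(self.points[i], q) <= self.r)


if __name__ == "__main__":
    index = PigeonholeIndex(["0000", "0001", "0111", "1111"], r=1)
    print(index.query("0000"))
```

At most r mismatches cannot touch all r+1 blocks, so every true neighbour appears in some bucket; the final distance check removes false positives.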
Transfer Adversarial Hashing for Hamming Space Retrieval
Hashing is widely applied to large-scale image retrieval due to its storage
and retrieval efficiency. Existing work on deep hashing assumes that the
database in the target domain is identically distributed with the training set
in the source domain. This paper relaxes this assumption to a transfer
retrieval setting, which allows the database and the training set to come from
different but relevant domains. However, the transfer retrieval setting will
introduce two technical difficulties: first, the hash model trained on the
source domain cannot work well on the target domain due to the large
distribution gap; second, the domain gap makes it difficult to concentrate the
database points to be within a small Hamming ball. As a consequence, transfer
retrieval performance within Hamming Radius 2 degrades significantly in
existing hashing methods. This paper presents Transfer Adversarial Hashing
(TAH), a new hybrid deep architecture that incorporates a pairwise
t-distribution cross-entropy loss to learn concentrated hash codes and an
adversarial network to align the data distributions between the source and
target domains. TAH can generate compact transfer hash codes for efficient
image retrieval on both source and target domains. Comprehensive experiments
validate that TAH yields state-of-the-art Hamming space retrieval performance
on standard datasets.
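Retrieval "within Hamming radius 2" is commonly implemented as a hash-table lookup: enumerate every code at distance at most 2 from the query code and probe a table keyed by binary code. The sketch below shows that lookup only; it does not implement TAH or any learned hash function, and the names `hamming_ball`, `build_table` and `lookup` are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterator, List, Sequence


def hamming_ball(code: int, bits: int, radius: int) -> Iterator[int]:
    """Yield every `bits`-bit code within the given Hamming radius of `code`."""
    for r in range(radius + 1):
        for positions in combinations(range(bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p  # flip bit p
            yield flipped


def build_table(codes: Sequence[int]) -> Dict[int, List[int]]:
    """Map each binary code to the indices of the database items that have it."""
    table: Dict[int, List[int]] = defaultdict(list)
    for idx, code in enumerate(codes):
        table[code].append(idx)
    return table


def lookup(table: Dict[int, List[int]], query: int, bits: int, radius: int = 2) -> List[int]:
    """Indices of database items whose code is within `radius` of `query`."""
    hits: List[int] = []
    for code in hamming_ball(query, bits, radius):
        hits.extend(table.get(code, []))
    return sorted(hits)


if __name__ == "__main__":
    database = [0b0000, 0b0011, 0b0111, 0b1111]
    print(lookup(build_table(database), 0b0000, bits=4))
```

The number of probes is 1 + b + b(b-1)/2 for b-bit codes, independent of database size, which is why concentrating relevant items inside a small Hamming ball matters for this style of retrieval.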
