The streaming $k$-mismatch problem

Author: Clifford Raphaël
Kociumaka Tomasz
Porat Ely
Publication date: 09/04/2018

We consider the streaming complexity of a fundamental task in approximate pattern matching: the $k$-mismatch problem. It asks to compute Hamming distances between a pattern of length $n$ and all length-$n$ substrings of a text for which the Hamming distance does not exceed a given threshold $k$. In our problem formulation, we report not only the Hamming distance but also, on demand, the full mismatch information, that is the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and also to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the $k$-mismatch problem which uses $O(k\log{n}\log\frac{n}{k})$ bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed by the text. The running time almost matches the classic offline solution and the space usage is within a logarithmic factor of optimal. Our new algorithm therefore effectively resolves and also extends an open problem first posed in FOCS'09. En route to this solution, we also give a deterministic $O(k(\log\frac{n}{k} + \log|\Sigma|))$-bit encoding of all the alignments with Hamming distance at most $k$ of a length-$n$ pattern within a text of length $O(n)$. This secondary result provides an optimal solution to a natural communication complexity problem which may be of independent interest. Comment: 27 pages
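To make the problem statement concrete, the following is a minimal offline sketch of what the $k$-mismatch problem asks for. It is a naive baseline written for illustration, not the streaming algorithm of the paper: it keeps the whole pattern and text in memory and takes time proportional to the pattern length per alignment in the worst case. The function name `k_mismatch_naive` and the tuple layout of the mismatch information are assumptions made here.

```python
from typing import Iterator, List, Tuple

# Hypothetical layout: (offset in pattern, pattern symbol, text symbol)
Mismatch = Tuple[int, str, str]


def k_mismatch_naive(
    pattern: str, text: str, k: int
) -> Iterator[Tuple[int, int, List[Mismatch]]]:
    """Yield (start, distance, mismatches) for every alignment of the
    pattern in the text whose Hamming distance is at most k."""
    n = len(pattern)
    for start in range(len(text) - n + 1):
        mismatches: List[Mismatch] = []
        for j in range(n):
            if pattern[j] != text[start + j]:
                mismatches.append((j, pattern[j], text[start + j]))
                if len(mismatches) > k:
                    break  # threshold exceeded, this alignment is not reported
        if len(mismatches) <= k:
            yield start, len(mismatches), mismatches


if __name__ == "__main__":
    for start, dist, info in k_mismatch_naive("abc", "abdabcxbc", k=1):
        print(start, dist, info)
```

Each reported alignment carries both the distance and the list of mismatched symbol pairs with their indices, which is the "mismatch information" the abstract refers to.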

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Classification and Galois conjugacy of Hamming maps

Author: Jones Gareth A.
Publication date: 02/06/2010

We show that for each d>0 the d-dimensional Hamming graph H(d,q) has an orientably regular surface embedding if and only if q is a prime power p^e. If q>2 there are up to isomorphism \phi(q-1)/e such maps, all constructed as Cayley maps for a d-dimensional vector space over the field of order q. We show that for each such pair d, q the corresponding Belyi pairs are conjugate under the action of the absolute Galois group, and we determine their minimal field of definition. We also classify the orientably regular embedding of merged Hamming graphs for q>3
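For readers unfamiliar with the object being embedded, here is a small sketch of the Hamming graph $H(d,q)$ under its standard definition: vertices are the $d$-tuples over an alphabet of $q$ symbols, and two vertices are adjacent when they differ in exactly one coordinate. The function name `hamming_graph` is an assumption made here, and the sketch only builds the graph; it says nothing about the surface embeddings classified in the paper.

```python
from itertools import product
from typing import Dict, List, Tuple

Vertex = Tuple[int, ...]


def hamming_graph(d: int, q: int) -> Dict[Vertex, List[Vertex]]:
    """Adjacency lists of H(d, q): d-tuples over {0, ..., q-1},
    adjacent when they differ in exactly one coordinate."""
    adjacency: Dict[Vertex, List[Vertex]] = {}
    for v in product(range(q), repeat=d):
        neighbours = []
        for i in range(d):
            for symbol in range(q):
                if symbol != v[i]:
                    # Change coordinate i only, keep the others fixed.
                    neighbours.append(v[:i] + (symbol,) + v[i + 1:])
        adjacency[v] = neighbours
    return adjacency


if __name__ == "__main__":
    cube = hamming_graph(3, 2)  # H(3, 2) is the ordinary cube graph
    print(len(cube), "vertices, degree", len(cube[(0, 0, 0)]))
```

Every vertex has degree $d(q-1)$, so the graph has $q^d \cdot d(q-1)/2$ edges.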

arXiv.org e-Print Archive

CiteSeerX

Clustering in Hilbert space of a quantum optimization problem

Author: Hsu B.
Laumann C. R.
Moessner R.
Morampudi S. C.
Sondhi S. L.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/04/2017

The solution space of many classical optimization problems breaks up into clusters which are extensively distant from one another in the Hamming metric. Here, we show that an analogous quantum clustering phenomenon takes place in the ground state subspace of a certain quantum optimization problem. This involves extending the notion of clustering to Hilbert space, where the classical Hamming distance is not immediately useful. Quantum clusters correspond to macroscopically distinct subspaces of the full quantum ground state space which grow with the system size. We explicitly demonstrate that such clusters arise in the solution space of random quantum satisfiability (3-QSAT) at its satisfiability transition. We estimate both the number of these clusters and their internal entropy. The former are given by the number of hardcore dimer coverings of the core of the interaction graph, while the latter is related to the underconstrained degrees of freedom not touched by the dimers. We additionally provide new numerical evidence suggesting that the 3-QSAT satisfiability transition may coincide with the product satisfiability transition, which would imply the absence of an intermediate entangled satisfiable phase. Comment: 11 pages, 6 figures

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Optimal Las Vegas Locality Sensitive Data Structures

Author: Ahle Thomas Dybdahl
Publication date: 01/10/2017

We show that approximate similarity (near neighbour) search can be solved in high dimensions with performance matching state of the art (data independent) Locality Sensitive Hashing, but with a guarantee of no false negatives. Specifically, we give two data structures for common problems. For $c$-approximate near neighbour in Hamming space we get query time $dn^{1/c+o(1)}$ and space $dn^{1+1/c+o(1)}$, matching that of \cite{indyk1998approximate} and answering a long standing open question from \cite{indyk2000dimensionality} and \cite{pagh2016locality} in the affirmative. By means of a new deterministic reduction from $\ell_1$ to Hamming we also solve $\ell_1$ and $\ell_2$ with query time $d^2n^{1/c+o(1)}$ and space $d^2 n^{1+1/c+o(1)}$. For $(s_1,s_2)$-approximate Jaccard similarity we get query time $dn^{\rho+o(1)}$ and space $dn^{1+\rho+o(1)}$, where $\rho=\log\frac{1+s_1}{2s_1}\big/\log\frac{1+s_2}{2s_2}$, when sets have equal size, matching the performance of \cite{tobias2016}. The algorithms are based on space partitions, as with classic LSH, but we construct these using a combination of brute force, tensoring, perfect hashing and splitter functions à la \cite{naor1995splitters}. We also show a new dimensionality reduction lemma with 1-sided error.
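As a worked illustration of the Jaccard exponent quoted above, the sketch below evaluates $\rho=\log\frac{1+s_1}{2s_1}\big/\log\frac{1+s_2}{2s_2}$ and plugs it into the leading terms of the stated bounds. The function names are hypothetical, the $o(1)$ terms in the exponents are dropped, and the requirement $0 < s_2 < s_1 \le 1$ is an assumption made here, so the printed numbers are only rough orders of magnitude and not figures from the paper.

```python
import math


def jaccard_rho(s1: float, s2: float) -> float:
    """Exponent rho = log((1+s1)/(2*s1)) / log((1+s2)/(2*s2)).

    Assumes 0 < s2 < s1 <= 1 (s1: similarity of near pairs,
    s2: similarity of far pairs), which gives 0 <= rho < 1.
    """
    if not (0.0 < s2 < s1 <= 1.0):
        raise ValueError("expected 0 < s2 < s1 <= 1")
    return math.log((1 + s1) / (2 * s1)) / math.log((1 + s2) / (2 * s2))


def leading_terms(d: int, n: int, rho: float) -> tuple:
    """Leading terms d*n^rho (query) and d*n^(1+rho) (space),
    ignoring the o(1) terms in the exponents."""
    return d * n ** rho, d * n ** (1 + rho)


if __name__ == "__main__":
    rho = jaccard_rho(0.5, 0.1)
    query, space = leading_terms(d=128, n=10**6, rho=rho)
    print(f"rho = {rho:.4f}, query ~ {query:.3g}, space ~ {space:.3g}")
```

The exponent shrinks toward 0 as $s_1$ approaches 1 and grows toward 1 as $s_2$ approaches $s_1$, which matches the intuition that well-separated similarity thresholds make the search easier.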

arXiv.org e-Print Archive

Crossref

Transfer Adversarial Hashing for Hamming Space Retrieval

Author: Cao Zhangjie
Huang Chao
Long Mingsheng
Wang Jianmin
Publication date: 13/12/2017

Hashing is widely applied to large-scale image retrieval due to the storage and retrieval efficiency. Existing work on deep hashing assumes that the database in the target domain is identically distributed with the training set in the source domain. This paper relaxes this assumption to a transfer retrieval setting, which allows the database and the training set to come from different but relevant domains. However, the transfer retrieval setting will introduce two technical difficulties: first, the hash model trained on the source domain cannot work well on the target domain due to the large distribution gap; second, the domain gap makes it difficult to concentrate the database points to be within a small Hamming ball. As a consequence, transfer retrieval performance within Hamming Radius 2 degrades significantly in existing hashing methods. This paper presents Transfer Adversarial Hashing (TAH), a new hybrid deep architecture that incorporates a pairwise $t$-distribution cross-entropy loss to learn concentrated hash codes and an adversarial network to align the data distributions between the source and target domains. TAH can generate compact transfer hash codes for efficient image retrieval on both source and target domains. Comprehensive experiments validate that TAH yields state of the art Hamming space retrieval performance on standard datasets.
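The "Hamming Radius 2" retrieval setting mentioned above can be sketched as a hash-table lookup: every binary code within Hamming distance 2 of the query code is probed, and the items stored under those codes are returned. This is a generic illustration of Hamming-ball lookup, not code from the TAH paper; the names `hamming_ball`, `build_index` and `search`, and the choice of Python integers as binary codes, are assumptions made here.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Iterator, List


def hamming_ball(code: int, bits: int, radius: int = 2) -> Iterator[int]:
    """Yield every bits-wide code within the given Hamming radius of code."""
    yield code
    for r in range(1, radius + 1):
        for positions in combinations(range(bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p  # flip bit p
            yield flipped


def build_index(codes: Iterable[int]) -> Dict[int, List[int]]:
    """Map each hash code to the ids of the database items that have it."""
    index: Dict[int, List[int]] = defaultdict(list)
    for item_id, code in enumerate(codes):
        index[code].append(item_id)
    return index


def search(index: Dict[int, List[int]], query: int, bits: int,
           radius: int = 2) -> List[int]:
    """Return ids of all items whose code lies in the query's Hamming ball."""
    hits: List[int] = []
    for probe in hamming_ball(query, bits, radius):
        hits.extend(index.get(probe, []))
    return sorted(hits)


if __name__ == "__main__":
    database = [0b0000, 0b0011, 0b0111, 0b1111]
    index = build_index(database)
    print(search(index, 0b0001, bits=4))
```

With radius 2 the number of probes is $1 + b + \binom{b}{2}$ for $b$-bit codes, which is why this style of lookup only pays off when relevant database points are concentrated in a small Hamming ball around the query.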

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications