Composite Correlation Quantization for Efficient Multimodal Retrieval
Efficient similarity retrieval from large-scale multimodal databases is
pervasive in modern search engines and social networks. To support queries
across content modalities, the system should enable cross-modal correlation and
computation-efficient indexing. While hashing methods have shown great
potential in achieving this goal, current attempts generally fail to learn
isomorphic hash codes in a seamless scheme, that is, they embed multiple
modalities in a continuous isomorphic space and separately threshold embeddings
into binary codes, which incurs substantial loss of retrieval accuracy. In this
paper, we approach seamless multimodal hashing by proposing a novel Composite
Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds
correlation-maximal mappings that transform different modalities into
an isomorphic latent space, and learns composite quantizers that convert the
isomorphic latent features into compact binary codes. An optimization framework
is devised to preserve both intra-modal similarity and inter-modal correlation
through minimizing both reconstruction and quantization errors, which can be
trained from both paired and partially paired data in linear time. A
comprehensive set of experiments clearly shows the superior effectiveness and
efficiency of CCQ against state-of-the-art hashing methods for both
unimodal and cross-modal retrieval.
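To make the accuracy loss from separate thresholding concrete, here is a minimal Python sketch. It does not implement CCQ. It only illustrates the two-step baseline the abstract criticizes, assuming simple sign thresholding; the function names and the embedding values are made up for illustration.

```python
def threshold_to_code(embedding):
    """Binarize a real-valued embedding by sign: 1 if the value is >= 0, else 0."""
    return [1 if value >= 0 else 0 for value in embedding]


def hamming(code_a, code_b):
    """Number of bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def squared_distance(vec_a, vec_b):
    """Squared Euclidean distance in the continuous latent space."""
    return sum((a - b) ** 2 for a, b in zip(vec_a, vec_b))


# Hypothetical latent embeddings of a paired image and text (made-up numbers).
image_embedding = [0.9, -0.8, 0.05, -0.1]
text_embedding = [0.8, -0.9, -0.05, 0.1]

# Step 2 of the baseline: threshold each modality independently.
image_code = threshold_to_code(image_embedding)  # [1, 0, 1, 0]
text_code = threshold_to_code(text_embedding)    # [1, 0, 0, 1]

latent_gap = squared_distance(image_embedding, text_embedding)
code_gap = hamming(image_code, text_code)
```

In this toy pair the embeddings are nearly identical in the latent space, yet half of the bits disagree after thresholding, because the values close to zero fall on opposite sides of the threshold. This is the kind of mismatch that learning the mappings and the quantizers jointly is meant to reduce.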
Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search
Mobile landmark search (MLS) has recently received increasing attention for its
great practical value. However, it remains unsolved due to two important
challenges. One is the high bandwidth consumption of query transmission, and the
other is the huge visual variations of query images sent from mobile devices.
In this paper, we propose a novel hashing scheme, named canonical view based
discrete multi-modal hashing (CV-DMH), to handle these problems via a novel
three-stage learning procedure. First, a submodular function is designed to
measure visual representativeness and redundancy of a view set. With it,
canonical views, which capture key visual appearances of a landmark with limited
redundancy, are efficiently discovered with an iterative mining strategy.
Second, multi-modal sparse coding is applied to transform visual features from
multiple modalities into an intermediate representation. It can robustly and
adaptively characterize visual contents of varied landmark images with certain
canonical views. Finally, compact binary codes are learned on the intermediate
representation within a tailored discrete binary embedding model which
preserves visual relations of images measured with canonical views and removes
the noise involved. In this part, we develop a new augmented Lagrangian
multiplier (ALM) based optimization method to directly solve for the discrete
binary codes. We can not only explicitly deal with the discrete constraint, but
also consider the bit-uncorrelated constraint and balance constraint together.
Experiments on real-world landmark datasets demonstrate the superior
performance of CV-DMH over several state-of-the-art methods.
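The first stage, picking a small set of representative and non-redundant views with a submodular score, can be sketched in Python as follows. The abstract does not give the actual function used by CV-DMH, so this sketch swaps in a standard facility-location score with greedy selection. The names `coverage` and `greedy_canonical_views` and the similarity matrix are hypothetical.

```python
def coverage(selected, sim):
    """Facility-location score: every view is credited with its best
    similarity to any selected view. This function is monotone submodular."""
    if not selected:
        return 0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_canonical_views(sim, k):
    """Iteratively add the view with the largest marginal gain in coverage."""
    selected = []
    for _ in range(k):
        base = coverage(selected, sim)
        best_view, best_gain = None, 0
        for candidate in range(len(sim)):
            if candidate in selected:
                continue
            gain = coverage(selected + [candidate], sim) - base
            if gain > best_gain:
                best_view, best_gain = candidate, gain
        if best_view is None:  # no remaining view improves the score
            break
        selected.append(best_view)
    return selected


# Made-up pairwise similarities on a 0-10 scale.
# Views 0, 1 and 2 look alike; view 3 shows a different side of the landmark.
similarity = [
    [10, 8, 6, 1],
    [8, 10, 7, 2],
    [6, 7, 10, 1],
    [1, 2, 1, 10],
]

canonical = greedy_canonical_views(similarity, k=2)  # [1, 3]
```

The greedy pass first picks view 1, which best represents the cluster of similar views, and then view 3, because adding another view from the same cluster would be largely redundant. Diminishing marginal gain is what lets one score account for both representativeness and redundancy.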
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a
large database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs hash
functions without exploring the data distribution, and learning to hash, which
learns hash functions according to the data distribution. We review them from
various aspects, including hash function design, distance measure, and search
scheme in the hash coding space.
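As an illustration of the first category, here is a minimal Python sketch of one well-known data-independent scheme, random hyperplane hashing for angular similarity. It is one example of locality sensitive hashing, not a summary of the survey. The function names, the seed, and the vectors are illustrative assumptions.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random Gaussian hyperplanes without looking at any data."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(w * v for w, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Distance measure in the hash coding space."""
    return sum(a != b for a, b in zip(code_a, code_b))


planes = make_hyperplanes(num_bits=16, dim=3)

query = [1.0, 2.0, -0.5]
same_direction = [2.0, 4.0, -1.0]   # the query scaled by 2
opposite = [-1.0, -2.0, 0.5]        # the query negated

query_code = lsh_code(query, planes)
same_code = lsh_code(same_direction, planes)
opposite_code = lsh_code(opposite, planes)
```

Vectors pointing in the same direction receive identical codes, while the negated vector lands on the other side of every hyperplane, so its code differs in all 16 bits. A learning-to-hash method would instead fit the hyperplanes, or a nonlinear hash function, to the data distribution.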