Scalable Image Retrieval by Sparse Product Quantization
Fast Approximate Nearest Neighbor (ANN) search for high-dimensional
feature indexing and retrieval is the crux of large-scale image retrieval. A
recent promising technique is Product Quantization, which attempts to index
high-dimensional image features by decomposing the feature space into a
Cartesian product of low dimensional subspaces and quantizing each of them
separately. Despite the promising results reported, its quantization approach
follows the typical hard assignment of traditional quantization methods, which
may result in large quantization errors and thus inferior search performance.
Unlike the existing approaches, in this paper, we propose a novel approach
called Sparse Product Quantization (SPQ) to encode the high-dimensional
feature vectors into sparse representations. We optimize the sparse
representations of the feature vectors by minimizing their quantization errors,
so that the resulting representation stays close to the original data
in practice. Experiments show that the proposed SPQ technique not only
compresses data but is also an effective encoding technique. We obtain
state-of-the-art results for ANN search on four public image datasets and the
promising results of content-based image retrieval further validate the
efficacy of our proposed method.
Comment: 12 pages
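The abstract above contrasts SPQ with the hard assignment used by standard Product Quantization. As a point of reference, here is a minimal sketch of that hard-assignment baseline only (not of SPQ itself, whose sparse encoding the abstract does not specify). The function names `kmeans`, `pq_train`, `pq_encode` and `pq_decode`, and the choice of plain Lloyd's k-means for the sub-codebooks, are assumptions made for illustration.

```python
import numpy as np

def sqdist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (k, d)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = sqdist(x, centroids).argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids

def pq_train(x, m, k):
    """Learn one codebook per subspace; d must be divisible by m."""
    return [kmeans(sub, k, seed=i) for i, sub in enumerate(np.split(x, m, axis=1))]

def pq_encode(x, codebooks):
    """Hard assignment: each sub-vector maps to its single nearest centroid."""
    subs = np.split(x, len(codebooks), axis=1)
    codes = [sqdist(sub, cb).argmin(1) for sub, cb in zip(subs, codebooks)]
    return np.stack(codes, axis=1).astype(np.uint8)  # assumes k <= 256

def pq_decode(codes, codebooks):
    """Reconstruct vectors by concatenating the selected centroids."""
    return np.hstack([cb[codes[:, i]] for i, cb in enumerate(codebooks)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 8))
    books = pq_train(data, m=4, k=16)
    approx = pq_decode(pq_encode(data, books), books)
    print("mean squared quantization error:", ((data - approx) ** 2).sum(1).mean())
```

Each vector is stored as `m` one-byte indices. The quantization error printed at the end is the quantity that, according to the abstract, SPQ reduces by replacing the single nearest centroid with a sparse combination.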
Generalized residual vector quantization for large scale data
Vector quantization is an essential tool for tasks involving large scale
data, for example, large scale similarity search, which is crucial for
content-based information retrieval and analysis. In this paper, we propose a
novel vector quantization framework that iteratively minimizes quantization
error. First, we provide a detailed review of a relevant vector quantization
method named residual vector quantization (RVQ). Next, we propose
generalized residual vector quantization (GRVQ) to further improve
over RVQ. Many vector quantization methods can be viewed as special cases
of our proposed framework. We evaluate GRVQ on several large scale benchmark
datasets for large scale search, classification and object retrieval. We
compare GRVQ with existing methods in detail. Extensive experiments
demonstrate that our GRVQ framework substantially outperforms existing methods in
terms of quantization accuracy and computation efficiency.
Comment: published on International Conference on Multimedia and Expo 201
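For orientation, the following is a minimal sketch of plain residual vector quantization, the method the abstract says GRVQ builds on. It does not implement GRVQ, whose generalization is not described in the abstract. The function names and the use of Lloyd's k-means at every stage are illustrative assumptions.

```python
import numpy as np

def sqdist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (k, d)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = sqdist(x, centroids).argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids

def rvq_train(x, stages, k):
    """Train one codebook per stage on the residual left by earlier stages."""
    codebooks, residual = [], x.copy()
    for s in range(stages):
        cb = kmeans(residual, k, seed=s)
        residual = residual - cb[sqdist(residual, cb).argmin(1)]
        codebooks.append(cb)
    return codebooks

def rvq_encode(x, codebooks):
    """Greedy encoding: quantize, subtract, and pass the residual on."""
    codes, residual = [], x.copy()
    for cb in codebooks:
        idx = sqdist(residual, cb).argmin(1)
        codes.append(idx)
        residual = residual - cb[idx]
    return np.stack(codes, axis=1)

def rvq_decode(codes, codebooks):
    """Reconstruction is the sum of the selected centroids over all stages."""
    return sum(cb[codes[:, s]] for s, cb in enumerate(codebooks))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(300, 4))
    for stages in (1, 2, 3):
        books = rvq_train(data, stages, k=8)
        approx = rvq_decode(rvq_encode(data, books), books)
        print(stages, "stage(s):", ((data - approx) ** 2).sum(1).mean())
```

On the training data, the mean error cannot increase from one stage to the next, because each stage subtracts cluster means of the current residuals.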
Orientation covariant aggregation of local descriptors with embeddings
Image search systems based on local descriptors typically achieve orientation
invariance by aligning the patches on their dominant orientations. Albeit
successful, this choice introduces too much invariance because it does not
guarantee that the patches are rotated consistently. This paper introduces an
aggregation strategy of local descriptors that achieves this covariance
property by jointly encoding the angle in the aggregation stage in a continuous
manner. It is combined with an efficient monomial embedding to provide a
codebook-free method to aggregate local descriptors into a single vector
representation. Our strategy is also compatible with, and is applied to, several
popular encoding methods, in particular bag-of-words, VLAD and the Fisher
vector. Our geometric-aware aggregation strategy is effective for image search,
as shown by experiments performed on standard benchmarks for image and
particular object retrieval, namely Holidays and Oxford buildings.
Comment: European Conference on Computer Vision (2014)
Learning a Complete Image Indexing Pipeline
To work at scale, a complete image indexing system comprises two components:
an inverted file index to restrict the actual search to only a subset that
should contain most of the items relevant to the query; and an approximate distance
computation mechanism to rapidly scan these lists. While supervised deep
learning has recently enabled improvements to the latter, the former continues
to be based on unsupervised clustering in the literature. In this work, we
propose a first system that learns both components within a unifying neural
framework of structured binary encoding.
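To make the two components concrete, here is a minimal sketch of the conventional, unsupervised baseline the abstract refers to: an inverted file built by k-means clustering, with a search that scans only the `nprobe` closest lists. It is not the learned system proposed in the paper. For brevity the scan uses exact distances where a real system would use an approximate distance computation. The names `build_ivf`, `search_ivf` and `nprobe` are assumptions made for illustration.

```python
import numpy as np

def sqdist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (k, d)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = sqdist(x, centroids).argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids

def build_ivf(x, n_lists):
    """Cluster the database and store, per centroid, the ids assigned to it."""
    centroids = kmeans(x, n_lists)
    assign = sqdist(x, centroids).argmin(1)
    lists = [np.flatnonzero(assign == j) for j in range(n_lists)]
    return centroids, lists

def search_ivf(q, x, centroids, lists, nprobe=2, topk=5):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe = ((centroids - q) ** 2).sum(1).argsort()[:nprobe]
    cand = np.concatenate([lists[j] for j in probe])
    d2 = ((x[cand] - q) ** 2).sum(1)  # exact here; approximate in practice
    return cand[d2.argsort()[:topk]]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    db = rng.normal(size=(300, 4))
    centroids, lists = build_ivf(db, n_lists=8)
    print(search_ivf(db[7], db, centroids, lists, nprobe=2, topk=5))
```

Raising `nprobe` trades speed for recall. With `nprobe` equal to the number of lists, the search degenerates to an exhaustive scan.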