26,426 research outputs found
Online Product Quantization
Approximate nearest neighbor (ANN) search has achieved great success in many
tasks. However, existing popular methods for ANN search, such as hashing and
quantization methods, are designed for static databases only. They cannot
handle well the database with data distribution evolving dynamically, due to
the high computational effort for retraining the model based on the new
database. In this paper, we address the problem by developing an online product
quantization (online PQ) model and incrementally updating the quantization
codebook that accommodates to the incoming streaming data. Moreover, to further
alleviate the issue of large scale computation for the online PQ update, we
design two budget constraints for the model to update partial PQ codebook
instead of all. We derive a loss bound which guarantees the performance of our
online PQ model. Furthermore, we develop an online PQ model over a sliding
window with both data insertion and deletion supported, to reflect the
real-time behaviour of the data. The experiments demonstrate that our online PQ
model is both time-efficient and effective for ANN search in dynamic large
scale databases compared with baseline methods and the idea of partial PQ
codebook update further reduces the update cost.Comment: To appear in IEEE Transactions on Knowledge and Data Engineering
(DOI: 10.1109/TKDE.2018.2817526
Product Markovian quantization of an R^d -valued Euler scheme of a diffusion process with applications to finance
We introduce a new approach to quantize the Euler scheme of an
-valued diffusion process. This method is based on a Markovian
and componentwise product quantization and allows us, from a numerical point of
view, to speak of {\em fast online quantization} in dimension greater than one
since the product quantization of the Euler scheme of the diffusion process and
its companion weights and transition probabilities may be computed quite
instantaneously. We show that the resulting quantization process is a Markov
chain, then, we compute the associated companion weights and transition
probabilities from (semi-) closed formulas. From the analytical point of view,
we show that the induced quantization errors at the -th discretization step
is a cumulative of the marginal quantization error up to time .
Numerical experiments are performed for the pricing of a Basket call option,
for the pricing of a European call option in a Heston model and for the
approximation of the solution of backward stochastic differential equations to
show the performances of the method
Generalized residual vector quantization for large scale data
Vector quantization is an essential tool for tasks involving large scale
data, for example, large scale similarity search, which is crucial for
content-based information retrieval and analysis. In this paper, we propose a
novel vector quantization framework that iteratively minimizes quantization
error. First, we provide a detailed review on a relevant vector quantization
method named \textit{residual vector quantization} (RVQ). Next, we propose
\textit{generalized residual vector quantization} (GRVQ) to further improve
over RVQ. Many vector quantization methods can be viewed as the special cases
of our proposed framework. We evaluate GRVQ on several large scale benchmark
datasets for large scale search, classification and object retrieval. We
compared GRVQ with existing methods in detail. Extensive experiments
demonstrate our GRVQ framework substantially outperforms existing methods in
term of quantization accuracy and computation efficiency.Comment: published on International Conference on Multimedia and Expo 201
Scalable Image Retrieval by Sparse Product Quantization
Fast Approximate Nearest Neighbor (ANN) search technique for high-dimensional
feature indexing and retrieval is the crux of large-scale image retrieval. A
recent promising technique is Product Quantization, which attempts to index
high-dimensional image features by decomposing the feature space into a
Cartesian product of low dimensional subspaces and quantizing each of them
separately. Despite the promising results reported, their quantization approach
follows the typical hard assignment of traditional quantization methods, which
may result in large quantization errors and thus inferior search performance.
Unlike the existing approaches, in this paper, we propose a novel approach
called Sparse Product Quantization (SPQ) to encoding the high-dimensional
feature vectors into sparse representation. We optimize the sparse
representations of the feature vectors by minimizing their quantization errors,
making the resulting representation is essentially close to the original data
in practice. Experiments show that the proposed SPQ technique is not only able
to compress data, but also an effective encoding technique. We obtain
state-of-the-art results for ANN search on four public image datasets and the
promising results of content-based image retrieval further validate the
efficacy of our proposed method.Comment: 12 page
Efficient Large-scale Approximate Nearest Neighbor Search on the GPU
We present a new approach for efficient approximate nearest neighbor (ANN)
search in high dimensional spaces, extending the idea of Product Quantization.
We propose a two-level product and vector quantization tree that reduces the
number of vector comparisons required during tree traversal. Our approach also
includes a novel highly parallelizable re-ranking method for candidate vectors
by efficiently reusing already computed intermediate values. Due to its small
memory footprint during traversal, the method lends itself to an efficient,
parallel GPU implementation. This Product Quantization Tree (PQT) approach
significantly outperforms recent state of the art methods for high dimensional
nearest neighbor queries on standard reference datasets. Ours is the first work
that demonstrates GPU performance superior to CPU performance on high
dimensional, large scale ANN problems in time-critical real-world applications,
like loop-closing in videos
- …