Embedding based on function approximation for large scale image search
The objective of this paper is to design an embedding method that maps local
features describing an image (e.g., SIFT) to a higher-dimensional
representation useful for the image retrieval problem. First, motivated by the
relationship between the linear approximation of a nonlinear function in
high-dimensional space and the state-of-the-art feature representation used in
image retrieval, i.e., VLAD, we propose a new approach for the approximation.
The embedded vectors resulting from the function approximation process are
then aggregated to
form a single representation for image retrieval. Second, in order to make the
proposed embedding method applicable to large-scale problems, we further
derive its fast version, in which the embedded vectors can be efficiently
computed, i.e., in closed form. We compare the proposed embedding methods with the
state of the art in the context of image search under various settings: when
the images are represented by medium-length vectors, short vectors, or binary
vectors. The experimental results show that the proposed embedding methods
outperform the state of the art on the standard public image retrieval
benchmarks.
Comment: Accepted to TPAMI 2017. The implementation and precomputed features
of the proposed F-FAemb are released at the following link:
http://tinyurl.com/F-FAem
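Since the paper positions its embedding against VLAD, a minimal sketch of standard VLAD aggregation may help fix ideas. This is the generic baseline, not the proposed FAemb/F-FAemb method, and all names and the normalization choices are illustrative:

```python
import numpy as np

def vlad_aggregate(local_descs, centroids):
    """Aggregate local descriptors (e.g. SIFT) of one image into a VLAD vector.

    local_descs: (N, d) array of local features.
    centroids:   (K, d) visual-word codebook (e.g. from k-means).
    Returns a power- and L2-normalized vector of length K * d.
    """
    K, d = centroids.shape
    # Assign each descriptor to its nearest centroid.
    dists = np.linalg.norm(local_descs[:, None, :] - centroids[None, :, :], axis=2)
    assign = np.argmin(dists, axis=1)
    # Accumulate residuals (descriptor minus centroid) per visual word.
    vlad = np.zeros((K, d))
    for k in range(K):
        members = local_descs[assign == k]
        if len(members):
            vlad[k] = (members - centroids[k]).sum(axis=0)
    vlad = vlad.ravel()
    # Signed square-root (power-law) then L2 normalization.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

The paper's contribution replaces the hard residual assignment above with coefficients from a function-approximation view, but the final aggregate-then-normalize pipeline is analogous.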
Supervised Hashing with End-to-End Binary Deep Neural Network
Image hashing is a popular technique for large-scale content-based visual
retrieval owing to its compact and efficient binary codes. Our work
proposes a new end-to-end deep network architecture for supervised hashing
which directly learns binary codes from input images and maintains good
properties over binary codes such as similarity preservation, independence, and
balancing. Furthermore, we also propose a new learning scheme that can cope
with the binary constrained loss function. The proposed algorithm not only is
scalable for learning over large-scale datasets but also outperforms
state-of-the-art supervised hashing methods, as illustrated through extensive
experiments on various image retrieval benchmarks.
Comment: Accepted to IEEE ICIP 201
Selective Deep Convolutional Features for Image Retrieval
Convolutional Neural Networks (CNNs) are a powerful approach to extracting
discriminative local descriptors for effective image search. Recent work adopts
fine-tuned strategies to further improve the discriminative power of the
descriptors. Taking a different approach, in this paper, we propose a novel
framework to achieve competitive retrieval performance. Firstly, we propose
various masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, to select a
representative subset of local convolutional features and remove a large number
of redundant features. We demonstrate that this can effectively address the
burstiness issue and improve retrieval accuracy. Secondly, we propose to employ
recent embedding and aggregating methods to further enhance feature
discriminability. Extensive experiments demonstrate that our proposed framework
achieves state-of-the-art retrieval accuracy.
Comment: Accepted to ACM MM 201
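A rough sketch of what SUM-mask and MAX-mask selection could look like on a CNN feature map. The above-mean threshold used for SUM-mask here is an assumption for illustration; the paper's exact selection rules may differ:

```python
import numpy as np

def sum_mask(fmap):
    """SUM-mask sketch: sum activations over channels at each spatial
    location and keep locations whose summed energy exceeds the mean,
    discarding weakly activated (often bursty or background) features.

    fmap: (C, H, W) convolutional feature map.
    Returns (M, C) selected local descriptors, one per kept location.
    """
    C, H, W = fmap.shape
    energy = fmap.sum(axis=0)            # (H, W) per-location activation
    keep = energy > energy.mean()        # boolean spatial mask (assumed rule)
    return fmap.reshape(C, -1).T[keep.ravel()]

def max_mask(fmap):
    """MAX-mask sketch: each channel 'votes' for the location where it
    attains its spatial maximum; keep the union of winning locations."""
    C, H, W = fmap.shape
    flat = fmap.reshape(C, -1)
    winners = np.unique(flat.argmax(axis=1))  # one winner per channel
    return flat.T[winners]
```

Either mask yields a reduced set of (H*W-indexed) local descriptors that can then be fed to the embedding and aggregation stage the paper describes.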
Volumetric 3D Point Cloud Attribute Compression: Learned polynomial bilateral filter for prediction
We extend a previous study on a 3D point cloud attribute compression scheme
that uses a volumetric approach: given a target volumetric attribute function
f, we quantize and encode parameters that characterize f at the encoder, for
reconstruction of f at known 3D points at the decoder. Specifically, the
parameters are quantized coefficients of B-spline basis vectors (for order p)
that span the function space at a particular resolution l, which are coded
from coarse to fine resolutions for scalability. In this work, we focus on the
prediction of finer-grained coefficients given coarser-grained ones by
learning parameters of a polynomial bilateral filter (PBF) from data. The PBF
is a pseudo-linear, signal-dependent filter with a graph spectral
interpretation common in the graph signal processing (GSP) field. We
demonstrate the PBF's predictive performance over a linear predictor inspired
by MPEG standardization on a wide range of point cloud datasets.
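A plain bilateral filter used as a coarse-to-fine attribute predictor can be sketched as follows. The learned polynomial combination that defines the paper's PBF is omitted, and every name, parameter, and the nearest-neighbor range proxy are illustrative assumptions:

```python
import numpy as np

def bilateral_predict(coarse_pts, coarse_attrs, fine_pts,
                      sigma_d=1.0, sigma_a=1.0):
    """Predict attributes at fine-level 3D points from coarse-level
    neighbors with bilateral weights: closeness in 3D space (domain term)
    AND in attribute value (range term) both raise a neighbor's weight,
    so the filter is signal-dependent (pseudo-linear).

    coarse_pts: (M, 3), coarse_attrs: (M,), fine_pts: (N, 3).
    """
    # Domain term: pairwise squared distances, fine targets x coarse sources.
    d2 = ((fine_pts[:, None, :] - coarse_pts[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma_d ** 2))
    # Range term: the fine point's true attribute is unknown, so use its
    # nearest coarse attribute as a proxy (an assumed heuristic).
    proxy = coarse_attrs[d2.argmin(axis=1)]
    a2 = (proxy[:, None] - coarse_attrs[None, :]) ** 2
    w *= np.exp(-a2 / (2 * sigma_a ** 2))
    w /= w.sum(axis=1, keepdims=True)   # normalize weights per fine point
    return w @ coarse_attrs
```

The paper's PBF would evaluate a polynomial in such a filter (e.g. a weighted sum of its powers) with coefficients learned from data; this sketch shows only the single bilateral building block.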
Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression
We study 3D point cloud attribute compression via a volumetric approach:
assuming point cloud geometry is known at both encoder and decoder, parameters
of a continuous attribute function are quantized to and encoded, so that discrete
samples can be recovered at known 3D points
at the decoder. Specifically, we consider a
nested sequences of function subspaces , where is a family
of functions spanned by B-spline basis functions of order , is the
projection of on and encoded as low-pass coefficients
, and is the residual function in orthogonal subspace
(where ) and encoded as high-pass coefficients . In
this paper, to improve coding performance over [1], we study predicting
at level given at level and encoding of
for the case (RAHT()). For the prediction, we formalize RAHT(1) linear
prediction in MPEG-PCC in a theoretical framework, and propose a new nonlinear
predictor using a polynomial of bilateral filter. We derive equations to
efficiently compute the critically sampled high-pass coefficients
amenable to encoding. We optimize parameters in our resulting feed-forward
network on a large training set of point clouds by minimizing a rate-distortion
Lagrangian. Experimental results show that our improved framework outperformed
the MPEG G-PCC predictor by to in bit rate reduction
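For the order-1 case the nested-subspace split reduces to a Haar-like analysis: each pair of finer-level samples yields one low-pass and one high-pass coefficient, and critical sampling means coefficient counts exactly match sample counts. A minimal sketch with unit weights (RAHT proper weights each pair by its point counts, which is omitted here):

```python
import numpy as np

def haar_split(attrs):
    """One analysis level of an order-1 (Haar-like) transform: each pair of
    samples produces a low-pass (scaled average) coefficient for the coarser
    level and a high-pass (scaled difference) coefficient for the residual
    in the orthogonal subspace.

    attrs: (2n,) attribute samples at the finer level.
    """
    a, b = attrs[0::2], attrs[1::2]
    low = (a + b) / np.sqrt(2)    # coarser-level approximation f_l*
    high = (a - b) / np.sqrt(2)   # orthogonal residual g_l*
    return low, high

def haar_merge(low, high):
    """Inverse transform: n low-pass plus n high-pass coefficients
    perfectly reconstruct the 2n finer-level samples (critical sampling)."""
    out = np.empty(2 * low.size)
    out[0::2] = (low + high) / np.sqrt(2)
    out[1::2] = (low - high) / np.sqrt(2)
    return out
```

The paper's predictor would estimate the finer-level samples from `low` alone (so only prediction residuals of `high` need encoding); the sketch shows only the transform structure those predictions operate within.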