Simultaneous Feature Learning and Hash Coding with Deep Neural Networks
Similarity-preserving hashing is a widely-used method for nearest neighbour
search in large-scale image retrieval tasks. For most existing hashing methods,
an image is first encoded as a vector of hand-engineered visual features,
followed by another separate projection or quantization step that generates
binary codes. However, such visual feature vectors may not be optimally
compatible with the coding process, thus producing sub-optimal hashing codes.
In this paper, we propose a deep architecture for supervised hashing, in which
images are mapped into binary codes via carefully designed deep neural
networks. The pipeline of the proposed deep architecture consists of three
building blocks: 1) a sub-network with a stack of convolution layers to produce
the effective intermediate image features; 2) a divide-and-encode module to
divide the intermediate image features into multiple branches, each encoded
into one hash bit; and 3) a triplet ranking loss designed to capture the
constraint that one image is more similar to a second image than to a third. Extensive
evaluations on several benchmark image datasets show that the proposed
simultaneous feature learning and hash coding pipeline brings substantial
improvements over other state-of-the-art supervised or unsupervised hashing
methods.
Comment: This paper has been accepted to IEEE International Conference on
Pattern Recognition and Computer Vision (CVPR), 201
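The second and third building blocks described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-branch linear layer with a sigmoid, the 0.5 binarization threshold, and the squared-Euclidean hinge form with a margin of 1.0 are all assumptions chosen for illustration.

```python
import math

def divide_and_encode(features, weights, biases):
    """Split `features` into len(weights) equal slices and map each slice
    to one value in (0, 1). Hypothetical helper, not the paper's code."""
    n_bits = len(weights)
    slice_len = len(features) // n_bits
    codes = []
    for k in range(n_bits):
        chunk = features[k * slice_len:(k + 1) * slice_len]
        # One small linear layer per branch, followed by a sigmoid (assumed).
        z = sum(w * x for w, x in zip(weights[k], chunk)) + biases[k]
        codes.append(1.0 / (1.0 + math.exp(-z)))
    return codes

def binarize(codes, threshold=0.5):
    """Turn relaxed codes in (0, 1) into hash bits (assumed threshold)."""
    return [1 if c > threshold else 0 for c in codes]

def triplet_ranking_loss(query, similar, dissimilar, margin=1.0):
    """Hinge loss on squared distances: zero once `similar` is closer to
    `query` than `dissimilar` is, by at least `margin`."""
    d_pos = sum((q - s) ** 2 for q, s in zip(query, similar))
    d_neg = sum((q - d) ** 2 for q, d in zip(query, dissimilar))
    return max(0.0, margin + d_pos - d_neg)

# Usage: a 4-dimensional feature split into 2 branches, one bit each.
features = [1.0, 2.0, -1.0, -2.0]
weights = [[1.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0]
bits = binarize(divide_and_encode(features, weights, biases))  # [1, 0]

# A correctly ordered triplet incurs no loss; a reversed one is penalized.
ok = triplet_ranking_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])   # 0.0
bad = triplet_ranking_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])  # 3.0
```

In training, a loss of this kind would be applied to the relaxed (real-valued) codes so that gradients can flow; the hard threshold is only needed when producing the final binary codes.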
Orthonormal Product Quantization Network for Scalable Face Image Retrieval
Recently, deep hashing with the Hamming distance metric has drawn increasing
attention for face image retrieval tasks. However, its counterpart, deep
quantization methods, which learn binary code representations with
dictionary-related distance metrics, have seldom been explored for the task.
This paper makes the first attempt to integrate product quantization into an
end-to-end deep learning framework for face image retrieval. Unlike prior deep
quantization methods where the codewords for quantization are learned from
data, we propose a novel scheme using predefined orthonormal vectors as
codewords, which aims to enhance the quantization informativeness and reduce
the codewords' redundancy. To make the most of the discriminative information,
we design a tailored loss function that maximizes the identity discriminability
in each quantization subspace for both the quantized and the original features.
Furthermore, an entropy-based regularization term is imposed to reduce the
quantization error. We conduct experiments on three commonly used datasets
under both single-domain and cross-domain retrieval settings. The results show
that the proposed method outperforms all the compared deep hashing/quantization
methods by a significant margin under both settings. The proposed
codeword scheme consistently improves both regular model performance and model
generalization ability, verifying the importance of the codewords' distribution for
quantization quality. Moreover, the model generalizes better
than deep hashing models, which indicates that it is more suitable for scalable face
image retrieval tasks.
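The idea of product quantization with predefined orthonormal codewords can be sketched as follows. This is an illustrative assumption rather than the paper's method: the codewords here are rows of a normalized Sylvester-Hadamard matrix (one convenient way to obtain orthonormal vectors), the same codebook is shared by every subspace, assignment is a hard argmax, and the names `hadamard_codewords` and `quantize` are hypothetical.

```python
import math

def hadamard_codewords(n):
    """Return n orthonormal vectors of length n via the Sylvester
    construction; n must be a power of two."""
    rows = [[1.0]]
    while len(rows) < n:
        # H_2k = [[H, H], [H, -H]]
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    if len(rows) != n:
        raise ValueError("n must be a power of two")
    scale = 1.0 / math.sqrt(n)
    return [[x * scale for x in r] for r in rows]

def quantize(feature, n_subspaces, codewords):
    """Split `feature` into equal sub-vectors and return, per subspace,
    the index of the codeword with the largest inner product."""
    sub_len = len(feature) // n_subspaces
    codes = []
    for m in range(n_subspaces):
        sub = feature[m * sub_len:(m + 1) * sub_len]
        scores = [sum(c * x for c, x in zip(cw, sub)) for cw in codewords]
        codes.append(max(range(len(scores)), key=scores.__getitem__))
    return codes

# Usage: an 8-dimensional feature, 2 subspaces, 4 codewords per subspace.
codewords = hadamard_codewords(4)
codes = quantize([2.0, 2.0, 2.0, 2.0, 3.0, -3.0, 3.0, -3.0], 2, codewords)
# codes == [0, 1]
```

Because every codeword has unit norm, picking the largest inner product is equivalent to picking the nearest codeword in Euclidean distance. Each index needs log2(4) = 2 bits here, so the whole feature is stored in 4 bits.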