1,280 research outputs found
Deep Discrete Hashing with Self-supervised Pairwise Labels
Hashing methods have been widely used for applications of large-scale image
retrieval and classification. Non-deep hashing methods using handcrafted
features have been significantly outperformed by deep hashing methods due to
their better feature representation and end-to-end learning framework. However,
the most striking successes in deep hashing have mostly involved discriminative
models, which require labels. In this paper, we propose a novel unsupervised
deep hashing method, named Deep Discrete Hashing (DDH), for large-scale image
retrieval and classification. In the proposed framework, we address two main
problems: 1) how to directly learn discrete binary codes? 2) how to equip the
binary representation with the ability of accurate image retrieval and
classification in an unsupervised way? We resolve these problems by introducing
an intermediate variable and a loss function steering the learning process,
which is based on the neighborhood structure in the original space.
Experimental results on standard datasets (CIFAR-10, NUS-WIDE, and Oxford-17)
demonstrate that our DDH significantly outperforms existing hashing methods by
large margin in terms of~mAP for image retrieval and object recognition. Code
is available at \url{https://github.com/htconquer/ddh}
Learning to Navigate the Energy Landscape
In this paper, we present a novel and efficient architecture for addressing
computer vision problems that use `Analysis by Synthesis'. Analysis by
synthesis involves the minimization of the reconstruction error which is
typically a non-convex function of the latent target variables.
State-of-the-art methods adopt a hybrid scheme where discriminatively trained
predictors like Random Forests or Convolutional Neural Networks are used to
initialize local search algorithms. While these methods have been shown to
produce promising results, they often get stuck in local optima. Our method
goes beyond the conventional hybrid architecture by not only proposing multiple
accurate initial solutions but by also defining a navigational structure over
the solution space that can be used for extremely efficient gradient-free local
search. We demonstrate the efficacy of our approach on the challenging problem
of RGB Camera Relocalization. To make the RGB camera relocalization problem
particularly challenging, we introduce a new dataset of 3D environments which
are significantly larger than those found in other publicly-available datasets.
Our experiments reveal that the proposed method is able to achieve
state-of-the-art camera relocalization results. We also demonstrate the
generalizability of our approach on Hand Pose Estimation and Image Retrieval
tasks
Fast Exact Search in Hamming Space with Multi-Index Hashing
There is growing interest in representing image data and feature descriptors
using compact binary codes for fast near neighbor search. Although binary codes
are motivated by their use as direct indices (addresses) into a hash table,
codes longer than 32 bits are not being used as such, as it was thought to be
ineffective. We introduce a rigorous way to build multiple hash tables on
binary code substrings that enables exact k-nearest neighbor search in Hamming
space. The approach is storage efficient and straightforward to implement.
Theoretical analysis shows that the algorithm exhibits sub-linear run-time
behavior for uniformly distributed codes. Empirical results show dramatic
speedups over a linear scan baseline for datasets of up to one billion codes of
64, 128, or 256 bits
- …