99 research outputs found
Evaluation of Hashing Methods Performance on Binary Feature Descriptors
In this paper we evaluate the performance of data-dependent hashing methods
on binary data. The goal is to find a hashing method that can effectively
produce lower-dimensional binary representations of 512-bit FREAK descriptors.
A representative sample of recent unsupervised, semi-supervised and supervised
hashing methods was experimentally evaluated on large datasets of labelled
binary FREAK feature descriptors.
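To make the task concrete, here is a minimal sketch (my own illustration, not
a method from the paper) of the simplest data-independent baseline for it:
bit-sampling LSH, which shortens a 512-bit binary descriptor to k bits by
keeping a random subset of bit positions. The data-dependent methods the paper
evaluates aim to choose such bits, or combinations of them, from training
data instead.

```python
# Bit-sampling LSH baseline: map 512-bit binary descriptors (e.g. FREAK)
# to shorter k-bit binary codes by sampling bit positions at random.
import numpy as np

rng = np.random.default_rng(0)

def fit_bit_sampling(n_input_bits=512, n_output_bits=64):
    """Pick k bit positions uniformly at random (data-independent)."""
    return rng.choice(n_input_bits, size=n_output_bits, replace=False)

def hash_descriptors(descriptors, positions):
    """descriptors: (n, 512) array of {0, 1}; returns (n, k) binary codes."""
    return descriptors[:, positions]

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))

# Toy usage with random stand-ins for FREAK descriptors.
descs = rng.integers(0, 2, size=(1000, 512), dtype=np.uint8)
positions = fit_bit_sampling()
codes = hash_descriptors(descs, positions)
print(codes.shape, hamming(codes[0], codes[1]))
```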
Unsupervised Deep Hashing for Large-scale Visual Search
Learning-based hashing plays a pivotal role in large-scale visual search.
However, most existing hashing algorithms tend to learn shallow models that do
not seek representative binary codes. In this paper, we propose a novel hashing
approach based on unsupervised deep learning to hierarchically transform
features into hash codes. Within the heterogeneous deep hashing framework, the
autoencoder layers with specific constraints are considered to model the
nonlinear mapping between features and binary codes. Then, a Restricted
Boltzmann Machine (RBM) layer with constraints is utilized to reduce the
dimension in the Hamming space. Extensive experiments on the problem of visual
search demonstrate the competitiveness of our proposed approach compared to
the state of the art.
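As a rough structural sketch of such a pipeline (the layer sizes, the tanh
nonlinearity, and the random weights below are assumptions of mine; the
constrained training of the autoencoder and RBM layers is omitted), features
are pushed through stacked nonlinear layers and then thresholded into
Hamming-space codes:

```python
# Hierarchical encoding sketch: stacked nonlinear layers, then binarization.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One encoder layer: affine map followed by a tanh nonlinearity."""
    return np.tanh(x @ w + b)

def encode_to_bits(x, weights):
    """Pass features through the stacked layers, then binarize by sign."""
    h = x
    for w, b in weights:
        h = layer(h, w, b)
    return (h > 0).astype(np.uint8)  # code in Hamming space

# Toy usage: 128-d features -> 64 -> 32-bit codes (untrained random weights).
dims = [128, 64, 32]
weights = [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
           for m, n in zip(dims, dims[1:])]
x = rng.normal(size=(5, 128))
print(encode_to_bits(x, weights))
```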
Online Supervised Hashing
Fast nearest neighbor search is becoming increasingly crucial given the advent of large-scale data in many computer vision applications. Hashing approaches provide both fast search mechanisms and compact index structures to address this critical need. In image retrieval problems where labeled training data is available, supervised hashing methods prevail over unsupervised methods. Most state-of-the-art supervised hashing approaches employ batch-learners. Unfortunately, batch-learning strategies may be inefficient when confronted with large datasets. Moreover, with batch-learners, it is unclear how to adapt the hash functions as the dataset continues to grow and new variations appear over time. To handle these issues, we propose OSH: an Online Supervised Hashing technique that is based on Error Correcting Output Codes. We consider a stochastic setting where the data arrives sequentially and our method learns and adapts its hashing functions in a discriminative manner. Our method makes no assumption about the number of possible class labels, and accommodates new classes as they are presented in the incoming data stream. In experiments with three image retrieval benchmarks, our method yields state-of-the-art retrieval performance as measured in Mean Average Precision, while also being orders of magnitude faster than competing batch methods for supervised hashing. Also, our method significantly outperforms recently introduced online hashing solutions.
Accepted manuscript: https://pdfs.semanticscholar.org/555b/de4f14630d8606e37096235da8933df228f1.pdf
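The sketch below illustrates the Error Correcting Output Codes idea in an
online setting; the class, the perceptron-style update, and all names are my
own simplifications rather than the authors' algorithm. Each class label is
assigned a binary codeword (new classes get fresh codewords as they appear in
the stream), and one linear hash function per bit is nudged toward the
codeword on each incoming sample.

```python
# Online ECOC-style hashing sketch: learn one linear hash function per bit.
import numpy as np

rng = np.random.default_rng(0)

class OnlineECOCHasher:
    def __init__(self, dim, n_bits=16, lr=0.1):
        self.W = np.zeros((n_bits, dim))  # one linear hash function per bit
        self.codewords = {}               # class label -> codeword in {-1,+1}^k
        self.n_bits, self.lr = n_bits, lr

    def _codeword(self, label):
        # New classes are accommodated by drawing a fresh random codeword.
        if label not in self.codewords:
            self.codewords[label] = rng.choice([-1, 1], size=self.n_bits)
        return self.codewords[label]

    def update(self, x, label):
        """One streaming step: correct bits that disagree with the codeword."""
        c = self._codeword(label)
        pred = np.sign(self.W @ x)
        wrong = pred != c
        self.W[wrong] += self.lr * np.outer(c[wrong], x)

    def hash(self, x):
        return (self.W @ x > 0).astype(np.uint8)

# Toy stream: two Gaussian classes in 8 dimensions.
h = OnlineECOCHasher(dim=8)
for _ in range(500):
    label = int(rng.integers(2))
    h.update(rng.normal(loc=3 * label, size=8), label)
print(h.hash(rng.normal(loc=0, size=8)), h.hash(rng.normal(loc=3, size=8)))
```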
Unsupervised Triplet Hashing for Fast Image Retrieval
Hashing has played a pivotal role in large-scale image retrieval. With the
development of Convolutional Neural Network (CNN), hashing learning has shown
great promise. However, existing methods are mostly tuned for classification
and are not optimized for retrieval tasks, especially instance-level retrieval.
In this study, we propose a novel hashing method for large-scale image
retrieval. Considering the difficulty of obtaining labeled datasets for the
image retrieval task at large scale, we propose a novel CNN-based unsupervised
hashing method, namely Unsupervised Triplet Hashing (UTH). The unsupervised
hashing network is designed under the following three principles: 1) more
discriminative representations for image retrieval; 2) minimum quantization
loss between the original real-valued feature descriptors and the learned hash
codes; 3) maximum information entropy for the learned hash codes. Extensive
experiments on CIFAR-10, MNIST and In-shop datasets have shown that UTH
outperforms several state-of-the-art unsupervised hashing methods in terms of
retrieval accuracy.
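To make the three principles concrete, here is a toy rendering of the
corresponding training signals (my own formulations; the paper's exact losses,
margins and weighting may differ):

```python
# Toy versions of the three training signals behind the UTH principles.
import numpy as np

def triplet_loss(a, p, n, margin=0.5):
    """Principle 1: an anchor should be closer to its positive than its negative."""
    return max(0.0, np.sum((a - p) ** 2) - np.sum((a - n) ** 2) + margin)

def quantization_loss(h):
    """Principle 2: real-valued outputs h should lie close to binary values +/-1."""
    return np.sum((h - np.sign(h)) ** 2)

def entropy_bonus(batch_h):
    """Principle 3: each bit should fire for about half the images (max entropy)."""
    p = np.clip(np.mean(batch_h > 0, axis=0), 1e-6, 1 - 1e-6)
    return np.sum(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

rng = np.random.default_rng(0)
a, p, n = rng.normal(size=(3, 32))  # anchor, positive, negative embeddings
batch = rng.normal(size=(64, 32))   # a batch of real-valued codes
print(triplet_loss(a, p, n), quantization_loss(a), entropy_bonus(batch))
```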
Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization
In this paper, we address the problem of learning compact
similarity-preserving embeddings for massive high-dimensional streams of data
in order to perform efficient similarity search. We present a new online method
for computing binary compressed representations (sketches) of high-dimensional
real feature vectors. Given an expected code length $c$ and high-dimensional
input data points, our algorithm provides a $c$-bit binary code preserving
the distances between points from the original high-dimensional space. Our
algorithm requires neither storing the whole dataset nor a chunk of it; thus
it is fully adapted to the streaming setting. It also provides low time
complexity and convergence guarantees. We demonstrate the quality of our
binary sketches through experiments on real data for the nearest neighbor
search task in the online setting.
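A minimal sketch of the overall recipe, assuming an Oja-style subspace tracker
as a generic stand-in (the paper's tracker and its diagonal-uniformization
step are not reproduced here): maintain an orthonormal basis of a
$c$-dimensional subspace from the stream, and take the signs of the
projections as the $c$-bit sketch.

```python
# Streaming binary sketching sketch: online subspace tracking + sign codes.
import numpy as np

rng = np.random.default_rng(0)

class StreamingSketcher:
    def __init__(self, dim, n_bits, lr=0.01):
        self.U, _ = np.linalg.qr(rng.normal(size=(dim, n_bits)))  # basis
        self.lr = lr

    def update(self, x):
        """One Oja-style step toward the principal subspace of the stream."""
        y = self.U.T @ x
        self.U += self.lr * np.outer(x - self.U @ y, y)
        self.U, _ = np.linalg.qr(self.U)  # re-orthonormalize the basis

    def sketch(self, x):
        """c-bit binary sketch: signs of the subspace projection."""
        return (self.U.T @ x > 0).astype(np.uint8)

s = StreamingSketcher(dim=100, n_bits=16)
for _ in range(2000):  # stream of anisotropic 100-d points, one at a time
    s.update(rng.normal(size=100) * np.linspace(1, 3, 100))
print(s.sketch(rng.normal(size=100)))
```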
10,000+ Times Accelerated Robust Subset Selection (ARSS)
Subset selection from massive data with noised information is increasingly
popular for various applications. This problem is still highly challenging as
current methods are generally slow in speed and sensitive to outliers. To
address the above two issues, we propose an accelerated robust subset selection
(ARSS) method. Specifically in the subset selection area, this is the first
attempt to employ the $\ell_p$-norm ($0 < p \leq 1$) based measure for the
representation loss, preventing large errors from dominating our objective. As
a result, the robustness against outlier elements is greatly enhanced.
In practice, the data size is generally much larger than the feature length,
i.e. $N \gg L$. Based on this observation, we propose a speedup solver (via
ALM and equivalent derivations) to greatly reduce the computational cost,
theoretically from $\mathcal{O}(N^4)$ to $\mathcal{O}(N^2 L)$. Extensive
experiments on ten benchmark
datasets verify that our method not only outperforms state-of-the-art
methods, but also runs 10,000+ times faster than the most related method.
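The robustness claim is easy to see numerically. In the toy illustration below
(not the ARSS solver), a single gross outlier dominates a squared loss but
contributes far less under an $\ell_p$ loss with $p \leq 1$:

```python
# Why an l_p loss with p <= 1 keeps one huge residual from dominating.
import numpy as np

residuals = np.array([0.5] * 50 + [100.0])  # 50 inliers, one gross outlier

def lp_loss(r, p):
    """Sum of |r_i|^p over all residuals."""
    return np.sum(np.abs(r) ** p)

for p in (2.0, 1.0, 0.5):
    total = lp_loss(residuals, p)
    share = np.abs(residuals[-1]) ** p / total
    print(f"p={p}: total loss {total:.1f}, outlier's share {share:.0%}")
```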
- …