Sequential compact code learning for unsupervised image hashing
Effective hashing for large-scale image databases is a popular research area, attracting much attention in computer vision and visual information retrieval. Several recent methods attempt to learn either graph embedding or semantic coding for fast and accurate applications. In this paper, a novel unsupervised framework, termed evolutionary compact embedding (ECE), is introduced to automatically learn task-specific binary hash codes. It can be regarded as an optimization algorithm that combines genetic programming (GP) with a boosting trick. In our architecture, each bit of ECE is iteratively computed using a weak binary classification function, which is evolved through GP by jointly minimizing its empirical risk with the AdaBoost strategy on a training set. We treat this as a greedy optimization that embeds high-dimensional data points into a similarity-preserving, low-dimensional Hamming space. We systematically evaluate ECE on two data sets, SIFT 1M and GIST 1M, showing the effectiveness and accuracy of our method for large-scale similarity search.
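To make the idea of similarity search in a Hamming space concrete, here is a minimal sketch of how binary hash codes are compared once they have been learned. It does not implement ECE, GP or AdaBoost; the function names `hamming_distance` and `nearest_codes` and the 8-bit codes are illustrative assumptions.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two codes stored as Python ints."""
    return bin(a ^ b).count("1")


def nearest_codes(query: int, database: list, k: int = 1) -> list:
    """Return indices of the k database codes closest to the query."""
    order = sorted(
        range(len(database)),
        key=lambda i: hamming_distance(query, database[i]),
    )
    return order[:k]


# Hypothetical 8-bit codes; real systems typically use 32 to 512 bits.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110110
print(nearest_codes(query, database, k=2))
```

Because the comparison is an XOR followed by a bit count, a linear scan over short codes is cheap, which is one reason compact binary codes are attractive for large databases.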
Evaluation of Hashing Methods Performance on Binary Feature Descriptors
In this paper we evaluate the performance of data-dependent hashing methods on
binary data. The goal is to find a hashing method that can effectively produce
a lower-dimensional binary representation of 512-bit FREAK descriptors. A
representative sample of recent unsupervised, semi-supervised and supervised
hashing methods was experimentally evaluated on large datasets of labelled
binary FREAK feature descriptors.
A General Two-Step Approach to Learning-Based Hashing
Most existing approaches to hashing apply a single form of hash function, and
an optimization process which is typically deeply coupled to this specific
form. This tight coupling restricts the flexibility of the method to respond to
the data, and can result in complex optimization problems that are difficult to
solve. Here we propose a flexible yet simple framework that is able to
accommodate different types of loss functions and hash functions. This
framework allows a number of existing approaches to hashing to be placed in
context, and simplifies the development of new problem-specific hashing
methods. Our framework decomposes the hashing learning problem into two steps: hash
bit learning and hash function learning based on the learned bits. The first
step can typically be formulated as binary quadratic problems, and the second
step can be accomplished by training standard binary classifiers. Both problems
have been extensively studied in the literature. Our extensive experiments
demonstrate that the proposed framework is effective, flexible and outperforms
the state-of-the-art.
Comment: 13 pages. Appearing in Int. Conf. Computer Vision (ICCV) 201
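The two-step decomposition described above can be sketched on toy data. This is an illustration under stated assumptions, not the authors' method: step 1 uses a simple greedy coordinate ascent on b^T S b as a stand-in for a binary quadratic solver, and step 2 uses a perceptron as a stand-in for a standard binary classifier. The names `learn_bit`, `train_perceptron`, `hash_bit`, and the matrices `S` and `X` are hypothetical.

```python
def learn_bit(S, sweeps=10):
    """Step 1 (stand-in): greedy ascent on b^T S b with b in {-1, +1}^n."""
    n = len(S)
    b = [1] * n
    for _ in range(sweeps):
        for i in range(n):
            # Pull of all other points on point i under the current bits.
            pull = sum(S[i][j] * b[j] for j in range(n) if j != i)
            if pull != 0:
                b[i] = 1 if pull > 0 else -1
    return b


def train_perceptron(X, y, epochs=50):
    """Step 2 (stand-in): fit a linear hash function to the learned bits."""
    w, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            score = sum(wi * xi for wi, xi in zip(w, x)) + bias
            if target * score <= 0:  # misclassified: nudge the boundary
                w = [wi + target * xi for wi, xi in zip(w, x)]
                bias += target
    return w, bias


def hash_bit(w, bias, x):
    """Apply the learned hash function to a (possibly unseen) point."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else -1


# Toy pairwise similarities: points 0,1 are similar, as are points 2,3.
S = [[1, 1, -1, -1], [1, 1, -1, -1], [-1, -1, 1, 1], [-1, -1, 1, 1]]
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]

bits = learn_bit(S)               # step 1: infer one bit per training point
w, bias = train_perceptron(X, bits)  # step 2: learn a function producing it
print(bits, [hash_bit(w, bias, x) for x in X])
```

The point of the split is that step 1 never looks at the form of the hash function, so the classifier in step 2 can be swapped without changing how the bits are inferred.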
ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks
Hash codes are efficient data representations for coping with the ever
growing amounts of data. In this paper, we introduce a random forest semantic
hashing scheme that embeds tiny convolutional neural networks (CNN) into
shallow random forests, with near-optimal information-theoretic code
aggregation among trees. We start with a simple hashing scheme, where random
trees in a forest act as hashing functions by setting '1' for the visited tree
leaf, and '0' for the rest. We show that traditional random forests fail to
generate hashes that preserve the underlying similarity between the trees,
rendering the random forests approach to hashing challenging. To address this,
we propose to first randomly group arriving classes at each tree split node
into two groups, obtaining a significantly simplified two-class classification
problem, which can be handled using a light-weight CNN weak learner. Such
random class grouping scheme enables code uniqueness by enforcing each class to
share its code with different classes in different trees. A non-conventional
low-rank loss is further adopted for the CNN weak learners to encourage code
consistency by minimizing intra-class variations and maximizing inter-class
distance for the two random class groups. Finally, we introduce an
information-theoretic approach for aggregating codes of individual trees into a
single hash code, producing a near-optimal unique hash for each class. The
proposed approach significantly outperforms state-of-the-art hashing methods
for image retrieval tasks on large-scale public datasets, and performs at
the level of other state-of-the-art image classification techniques while
utilizing a more compact, efficient and scalable representation. This work
proposes a principled and robust procedure to train and deploy in parallel an
ensemble of light-weight CNNs, instead of simply going deeper.
Comment: Accepted to ECCV 201
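The simple starting scheme mentioned in this abstract, where each tree writes '1' for the visited leaf and '0' for the rest, can be sketched as follows. This covers only that baseline encoding, not the random class grouping, the CNN weak learners or the information-theoretic aggregation. The function `forest_code` is hypothetical, and the leaf indices are supplied directly rather than produced by trained trees.

```python
def forest_code(leaf_indices, leaves_per_tree):
    """Concatenate one one-hot block per tree, marking the visited leaf."""
    code = []
    for leaf in leaf_indices:
        block = [0] * leaves_per_tree
        block[leaf] = 1  # '1' for the visited leaf, '0' for the rest
        code.extend(block)
    return code


# Hypothetical forest of 3 trees with 4 leaves each; a sample lands in
# leaf 2 of the first tree, leaf 0 of the second and leaf 3 of the third.
print(forest_code([2, 0, 3], leaves_per_tree=4))
```

Two samples share a '1' position only when they reach the same leaf in the same tree, so the overlap between codes depends entirely on how consistently the trees route similar inputs.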