Deep Discrete Hashing with Self-supervised Pairwise Labels
Hashing methods have been widely used for applications of large-scale image
retrieval and classification. Non-deep hashing methods using handcrafted
features have been significantly outperformed by deep hashing methods due to
their better feature representation and end-to-end learning framework. However,
the most striking successes in deep hashing have mostly involved discriminative
models, which require labels. In this paper, we propose a novel unsupervised
deep hashing method, named Deep Discrete Hashing (DDH), for large-scale image
retrieval and classification. In the proposed framework, we address two main
problems: 1) how to directly learn discrete binary codes? 2) how to equip the
binary representation with the ability of accurate image retrieval and
classification in an unsupervised way? We resolve these problems by introducing
an intermediate variable and a loss function steering the learning process,
which is based on the neighborhood structure in the original space.
Experimental results on standard datasets (CIFAR-10, NUS-WIDE, and Oxford-17)
demonstrate that our DDH significantly outperforms existing hashing methods by a
large margin in terms of mAP for image retrieval and object recognition. Code
is available at https://github.com/htconquer/ddh.
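As an illustration of the retrieval side of hashing methods like the one above (this is a minimal sketch, not the authors' DDH implementation: binary codes are obtained here by simple sign thresholding of feature vectors, and retrieval is a Hamming-distance ranking):

```python
import numpy as np

def binary_codes(features, threshold=0.0):
    """Map real-valued features to discrete binary codes by sign thresholding."""
    return (features > threshold).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code (ascending)."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

# Toy example: four database items with 8-bit codes
db = binary_codes(np.array([
    [ 0.9, -0.2,  0.4,  0.1, -0.7,  0.3, -0.1,  0.8],
    [-0.5,  0.6, -0.3, -0.9,  0.2, -0.4,  0.7, -0.6],
    [ 0.8, -0.1,  0.5,  0.2, -0.6,  0.4, -0.2,  0.9],
    [-0.3,  0.4,  0.1, -0.2,  0.5, -0.8,  0.6, -0.4],
]))
query = binary_codes(np.array([0.7, -0.3, 0.6, 0.3, -0.5, 0.2, -0.3, 0.7]))
order, dists = hamming_rank(query, db)
# Items 0 and 2 share the query's code, so they rank first.
```

The appeal of such codes is that Hamming distances over millions of items reduce to XOR and popcount operations, which is what makes hashing attractive at large scale.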
Deep Bottleneck Feature for Image Classification
Effective image representation plays an important role in image classification and retrieval. Bag-of-Features (BoF) is well known as an effective and robust visual representation. However, on large datasets, convolutional neural networks (CNNs) tend to perform much better, aided by the availability of large amounts of training data. In this paper, we propose a bag of Deep Bottleneck Features (DBF) for image classification, effectively combining the strengths of a CNN with a BoF framework. The DBF features, obtained from a previously well-trained CNN, form a compact and low-dimensional representation of the original inputs, effective even for small datasets. We demonstrate that the resulting BoDBF method has a very powerful and discriminative capability that generalises to other image classification tasks.
Image Retrieval based on Bag-of-Words model
This article gives a survey of the bag-of-words (BoW), or bag-of-features,
model in image retrieval systems. In recent years, large-scale image retrieval
has shown significant potential in both industry applications and research
problems. As local descriptors like SIFT demonstrate great discriminative power
in solving vision problems such as object recognition, image classification,
and annotation, more and more state-of-the-art large-scale image retrieval
systems rely on them. A common way to achieve this is to first quantize local
descriptors into visual words, and then apply scalable textual indexing and
retrieval schemes. We call this the bag-of-words or bag-of-features model.
The goal of this survey is to give an overview of this model and introduce
different strategies for building a retrieval system based on it.
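The quantize-then-count pipeline described in this abstract can be sketched as follows (a toy illustration, not from the survey itself: the "vocabulary" here is a hand-picked set of centroids standing in for a codebook that would normally be learned with k-means over many descriptors):

```python
import numpy as np

def assign_words(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word (codebook centroid)."""
    # Pairwise squared Euclidean distances, shape (n_descriptors, n_words)
    d = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def bow_histogram(descriptors, vocabulary):
    """Quantize descriptors into visual words; return an L1-normalized histogram."""
    words = assign_words(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Toy vocabulary of three "visual words" in a 2-D descriptor space
vocab = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
desc = np.array([[0.1, 0.1], [0.9, 1.1], [1.9, 0.1], [0.2, -0.1]])
hist = bow_histogram(desc, vocab)
```

The resulting sparse histogram is what allows inverted-file (textual) indexing schemes to be reused for images, which is the central point of the BoW model.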
Hierarchy-based Image Embeddings for Semantic Image Retrieval
Deep neural networks trained for classification have been found to learn
powerful image representations, which are also often used for other tasks such
as comparing images w.r.t. their visual similarity. However, visual similarity
does not imply semantic similarity. In order to learn semantically
discriminative features, we propose to map images onto class embeddings whose
pair-wise dot products correspond to a measure of semantic similarity between
classes. Such an embedding does not only improve image retrieval results, but
could also facilitate integrating semantics for other tasks, e.g., novelty
detection or few-shot learning. We introduce a deterministic algorithm for
computing the class centroids directly based on prior world-knowledge encoded
in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds,
and ImageNet show that our learned semantic image embeddings improve the
semantic consistency of image retrieval results by a large margin.
Comment: Accepted at WACV 2019. Source code: https://github.com/cvjena/semantic-embedding
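The key property claimed in this abstract, class embeddings whose pairwise dot products equal a given semantic similarity, can be illustrated with a small sketch (an assumption-laden toy, not the paper's algorithm: the similarity matrix below is made up, and a Cholesky factor is used as one way to realize the dot-product property when the matrix is positive definite):

```python
import numpy as np

# Hypothetical semantic similarity matrix for three classes, e.g. derived
# from the depth of the lowest common ancestor in a hierarchy like WordNet.
# Rows/cols: (cat, dog, airplane); cat and dog are semantically close.
S = np.array([
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.2],
    [0.2, 0.2, 1.0],
])

# A Cholesky factor gives embeddings (the rows of L) whose pairwise dot
# products reproduce S exactly: L @ L.T == S.
L = np.linalg.cholesky(S)
reconstructed = L @ L.T
```

Mapping images onto such class centroids then makes retrieval distances reflect semantic, not merely visual, similarity.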
Class-Weighted Convolutional Features for Visual Instance Search
Image retrieval in realistic scenarios targets large dynamic datasets of
unlabeled images. In these cases, training or fine-tuning a model every time
new images are added to the database is neither efficient nor scalable.
Convolutional neural networks trained for image classification over large
datasets have been proven effective feature extractors for image retrieval. The
most successful approaches are based on encoding the activations of
convolutional layers, as they convey the image spatial information. In this
paper, we go beyond this spatial information and propose a local-aware encoding
of convolutional features based on semantic information predicted in the target
image. To this end, we obtain the most discriminative regions of an image using
Class Activation Maps (CAMs). CAMs are based on the knowledge contained in the
network and therefore, our approach, has the additional advantage of not
requiring external information. In addition, we use CAMs to generate object
proposals during an unsupervised re-ranking stage after a first fast search.
Our experiments on two publicly available datasets for instance retrieval,
Oxford5k and Paris6k, demonstrate the competitiveness of our approach,
outperforming the current state-of-the-art when using off-the-shelf models
trained on ImageNet. The source code and model used in this paper are publicly
available at http://imatge-upc.github.io/retrieval-2017-cam/.
Comment: To appear in the British Machine Vision Conference (BMVC), September 201
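The CAM-weighted encoding of convolutional features described above can be sketched roughly as follows (a simplified illustration, not the authors' code: it assumes a precomputed non-negative CAM and applies it as a spatial weighting over the activation maps before sum-pooling):

```python
import numpy as np

def cam_weighted_descriptor(conv_feats, cam):
    """Aggregate conv features into one descriptor, spatially weighted by a CAM.

    conv_feats: (C, H, W) activations from a convolutional layer.
    cam:        (H, W) class activation map, assumed non-negative.
    """
    w = cam / (cam.sum() + 1e-12)                          # normalize spatial weights
    desc = (conv_feats * w[None, :, :]).sum(axis=(1, 2))   # weighted sum-pooling
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc                     # L2-normalize for cosine matching

rng = np.random.default_rng(0)
feats = rng.random((4, 3, 3))                 # toy 4-channel, 3x3 activation grid
cam = np.array([[0.0, 1.0, 0.0],
                [1.0, 4.0, 1.0],
                [0.0, 1.0, 0.0]])             # attention concentrated at the center
descriptor = cam_weighted_descriptor(feats, cam)
```

Because the CAM comes from the network's own classifier weights, this weighting needs no external region annotations, which is the advantage the abstract emphasizes.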
A Novel Adaptive LBP-Based Descriptor for Color Image Retrieval
In this paper, we present two approaches to extract discriminative features for color image retrieval. The proposed local texture descriptors, based on the Radial Mean Local Binary Pattern (RMLBP), are called Color RMCLBP (CRMCLBP) and Prototype Data Model (PDM). RMLBP is a noise-robust descriptor proposed to extract texture features of gray-scale images for texture classification.
For the first descriptor, the Radial Mean Completed Local Binary Pattern is applied to each channel of the color space independently. The final descriptor is then obtained by concatenating the histograms of the CRMCLBP_S/M/C components of each channel. Moreover, to enhance the performance of the proposed method, the Particle Swarm Optimization (PSO) algorithm is used for feature weighting.
The second proposed descriptor, PDM, uses the three outputs of CRMCLBP (CRMCLBP_S, CRMCLBP_M, CRMCLBP_C) as discriminative features for each pixel of a color image. A set of representative feature vectors is then selected from each image by applying the k-means clustering algorithm. This set of selected prototypes is compared by means of a new similarity measure to find the most relevant images. Finally, the weighted version of PDM is constructed using the PSO algorithm.
Our proposed methods are tested on the Wang, Corel-5k, Corel-10k and Holidays datasets. The results show that our proposed methods make an acceptable tradeoff between speed and retrieval accuracy. The first descriptor improves on state-of-the-art color texture descriptors in both aspects. The second is a very fast retrieval algorithm that extracts discriminative features.
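For readers unfamiliar with the LBP family these descriptors build on, here is the basic 8-neighbor Local Binary Pattern (a generic textbook sketch, not RMLBP or CRMCLBP: each pixel is encoded by thresholding its neighbors against the center and packing the comparison bits into a code):

```python
import numpy as np

def lbp_image(gray):
    """Compute the basic 8-neighbor Local Binary Pattern for interior pixels.

    Each interior pixel is encoded by comparing its 8 neighbors to the center:
    neighbors >= center contribute a 1-bit, packed clockwise into an 8-bit code.
    """
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.int32)
    center = gray[1:-1, 1:-1]
    # Clockwise neighbor offsets starting at the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neighbor >= center).astype(np.int32) << bit
    return out

gray = np.array([[5, 5, 5],
                 [5, 1, 5],
                 [5, 5, 5]], dtype=np.int32)
codes = lbp_image(gray)   # one interior pixel; all 8 neighbors exceed the center
```

A histogram of such codes over an image (or region) is the texture feature; the RMLBP variants in the abstract replace the raw neighbor values with radial means to gain robustness to noise.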