Few but Informative Local Hash Code Matching for Image Retrieval
Content-based image retrieval (CBIR) aims to search an extensive database for the images most similar to a given query. Existing CBIR methods either represent each image with a compact global feature vector or extract a large number of highly compressed low-dimensional local features, each of which carries limited information. In this study, we propose an expressive local feature extraction pipeline and a many-to-many local feature matching method for large-scale CBIR. Unlike existing local feature methods, which tend to extract large numbers of low-dimensional local features from each image, the proposed method models characteristic feature representations for each image, aiming to employ fewer but more expressive local features. To further improve results, an end-to-end trainable hash encoding layer extracts compact but informative codes from the images. The proposed many-to-many local feature matching is then performed directly on the hash feature vectors of the input images, leading to new state-of-the-art performance on several benchmark datasets.
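Matching on binary hash codes of this kind typically reduces to Hamming distances between code sets. A minimal numpy sketch of many-to-many matching between query and database local codes follows; the `threshold` parameter and the count-based score are hypothetical illustrations, not the paper's exact scoring rule:

```python
import numpy as np

def hamming_matrix(codes_a, codes_b):
    """Pairwise Hamming distances between two sets of binary hash codes.

    codes_a: (n, d) array of {0, 1}; codes_b: (m, d) array of {0, 1}.
    Returns an (n, m) matrix counting differing bits per pair.
    """
    # Broadcast a comparison over all pairs, then count mismatched bits.
    return (codes_a[:, None, :] != codes_b[None, :, :]).sum(axis=-1)

def many_to_many_score(codes_q, codes_db, threshold=8):
    """Toy many-to-many similarity: count query/database local-code pairs
    whose Hamming distance falls at or below `threshold` (an assumed
    parameter for illustration)."""
    d = hamming_matrix(codes_q, codes_db)
    return int((d <= threshold).sum())
```

Because every local code of the query is compared against every local code of a candidate image, a single image pair can contribute multiple matches, which is what "many-to-many" means here.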
Edge-Directed Invariant Shoeprint Image Retrieval
In this paper, we propose the use of image feature points for the classification of shoeprint images in a forensic setting. These feature points are quantified using wavelet maxima points extracted from a nonorthogonal wavelet decomposition of the shoeprint images. Wavelet transforms have been shown to be an effective analysis tool for image indexing, retrieval, and characterization. This effectiveness is mainly attributed to the ability of these transforms to capture the spatial information and visual features of the analyzed images using only a few dominant subband coefficients. In this work, we propose the use of a nonorthogonal multiresolution representation to achieve shift invariance. To reduce content redundancy, we limit the feature space to wavelet maxima points. Such dimensionality reduction enables compact image representation while satisfying the requirements of the "information-preserving" rule. Based on the wavelet maxima representations, we suggest a variance-weighted minimum distance measure as the similarity metric for image query, retrieval, and search purposes. As a result, each image is indexed by a vector in the wavelet maxima moment space. Finally, performance results are reported to illustrate the robustness of the extracted features in searching and retrieving shoeprint images independently of position, size, orientation, and image background.
Aggregated Deep Local Features for Remote Sensing Image Retrieval
Remote Sensing Image Retrieval remains a challenging topic due to the special nature of Remote Sensing Imagery. Such images contain many different semantic objects, which clearly complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters such as the multiplicative and additive attention mechanisms and descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that even without training, the local convolutional features and global representation outperform other systems. After system tuning, we achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than the current systems.
Comment: Published in Remote Sensing. The first two authors have equal contribution.
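VLAD aggregation, as used above, assigns each local descriptor to its nearest visual-word centroid and accumulates the residuals per centroid. A minimal numpy sketch (the centroids would come from k-means on training descriptors; normalization details such as intra-normalization vary between implementations):

```python
import numpy as np

def vlad(descriptors, centroids):
    """Vector of Locally Aggregated Descriptors.

    descriptors: (n, d) local features; centroids: (k, d) visual words.
    Assigns each descriptor to its nearest centroid, sums the residuals
    (descriptor minus centroid) per centroid, and L2-normalizes the
    flattened (k * d,) result.
    """
    k, d = centroids.shape
    # Nearest centroid for every descriptor.
    dists = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=-1)
    assign = np.argmin(dists, axis=1)
    v = np.zeros((k, d))
    for i, c in enumerate(assign):
        v[c] += descriptors[i] - centroids[c]  # accumulate residual
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

The resulting fixed-length vector can be compared with a plain dot product regardless of how many local features each image produced, which is what makes VLAD suitable as a global descriptor.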
Object Level Deep Feature Pooling for Compact Image Representation
Convolutional Neural Network (CNN) features have been successfully employed in recent works as an image descriptor for various vision tasks. But the inability of deep CNN features to exhibit invariance to geometric transformations and object compositions poses a great challenge for image search. In this work, we demonstrate the effectiveness of an objectness prior over the deep CNN features of image regions for obtaining an invariant image representation. The proposed approach represents the image as a vector of pooled CNN features describing the underlying objects. This representation provides robustness to the spatial layout of the objects in the scene and achieves invariance to general geometric transformations, such as translation, rotation, and scaling. The proposed approach also leads to a compact representation of the scene, giving each image a smaller memory footprint. Experiments show that the proposed representation achieves state-of-the-art retrieval results on a set of challenging benchmark image datasets, while maintaining a compact representation.
Comment: Deep Vision 201
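The layout robustness described above follows from pooling per-region features with an operation that ignores region order. A minimal sketch using element-wise max over object-proposal features; this illustrates the pooling idea only, not the paper's full pipeline (proposal generation and CNN feature extraction are assumed to have happened upstream):

```python
import numpy as np

def object_pooled_descriptor(region_features):
    """Pool per-region CNN features into one image descriptor.

    region_features: (r, d) array, one feature vector per object region.
    Element-wise max across regions is invariant to region order, so
    rearranging the objects in the scene leaves the descriptor unchanged.
    The result is L2-normalized for dot-product comparison.
    """
    pooled = region_features.max(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled
```

Because the pooled vector has a fixed dimension d no matter how many regions the image contains, it stays compact even for cluttered scenes.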
Particular object retrieval with integral max-pooling of CNN activations
Recently, image representations built upon Convolutional Neural Networks (CNNs) have been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and are still outperformed, on some particular-object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN. We build compact feature vectors that encode several image regions without the need to feed multiple inputs to the network. Furthermore, we extend integral images to handle max-pooling on convolutional layer activations, allowing us to efficiently localize matching objects. The resulting bounding box is finally used for image re-ranking. As a result, this paper significantly improves the existing CNN-based recognition pipeline: we report for the first time results competing with traditional methods on the challenging Oxford5k and Paris6k datasets.
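The integral-image mechanism underlying the efficient region pooling above can be sketched briefly. An integral image stores 2-D prefix sums, so pooling any rectangle of a feature map costs O(1) regardless of its size; the paper builds on this to approximate max-pooling over activations, while the sketch below shows only the core constant-time rectangle-sum step:

```python
import numpy as np

def integral_image(fmap):
    """2-D prefix sums of a feature-map channel, padded with a zero
    border so rectangle queries need no edge cases."""
    ii = np.zeros((fmap.shape[0] + 1, fmap.shape[1] + 1))
    ii[1:, 1:] = fmap.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of fmap[r0:r1, c0:c1] in O(1) via four integral-image reads."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

Scoring many candidate bounding boxes then costs four array reads per box instead of a full re-pooling pass, which is what makes localization-based re-ranking cheap.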