
    Few but Informative Local Hash Code Matching for Image Retrieval

    Content-based image retrieval (CBIR) aims to search an extensive database for the images most similar to a given query. Existing CBIR works either represent each image with a compact global feature vector or extract a large number of highly compressed low-dimensional local features, each of which carries limited information. In this work, we propose an expressive local feature extraction pipeline and a many-to-many local feature matching method for large-scale CBIR. Unlike existing local feature methods, which tend to extract large numbers of low-dimensional local features from each image, the proposed method models characteristic feature representations for each image, aiming to employ fewer but more expressive local features. To further improve the results, an end-to-end trainable hash encoding layer is used to extract compact but informative codes from images. The proposed many-to-many local feature matching is then performed directly on the hash feature vectors of the input images, leading to new state-of-the-art performance on several benchmark datasets.
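    As a concrete illustration of the matching step, the sketch below scores an image pair by comparing every local binary hash code of the query against every local code of a database image via Hamming distance. The function names, the distance threshold, and the simple pair-weighting score are illustrative assumptions, not the paper's exact matching rule.

```python
import numpy as np

def hamming_distances(query_codes, db_codes):
    """All-pairs Hamming distances between two sets of binary hash codes.

    query_codes: (m, d) array of 0/1 values, one row per local feature.
    db_codes:    (n, d) array of 0/1 values.
    Returns an (m, n) matrix of bit-difference counts.
    """
    # Broadcasting compares every query code against every database code.
    return (query_codes[:, None, :] != db_codes[None, :, :]).sum(axis=2)

def many_to_many_score(query_codes, db_codes, threshold=8):
    """Illustrative image-pair score: every local-code pair closer than
    `threshold` bits counts as a match, and closer pairs count more.
    The threshold and weighting here are assumptions."""
    d = hamming_distances(query_codes, db_codes)
    matched = d < threshold
    return int(((threshold - d) * matched).sum())
```

    Ranking the database then amounts to sorting images by this score against the query's local codes.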

    Edge-Directed Invariant Shoeprint Image Retrieval

    In this paper, we propose the use of image feature points for the classification of shoeprint images in a forensic setting. These feature points are quantified using wavelet maxima points extracted from a nonorthogonal wavelet decomposition of the shoeprint images. Wavelet transforms have been shown to be an effective analysis tool for image indexing, retrieval and characterization. This effectiveness is mainly attributed to the ability of these transforms to capture the spatial information and visual features of the analyzed images using only a few dominant subband coefficients. In this work, we propose the use of a nonorthogonal multiresolution representation to achieve shift invariance. To reduce content redundancy, we limit the feature space to wavelet maxima points. This dimensionality reduction enables a compact image representation while satisfying the requirements of the "information-preserving" rule. Based on the wavelet maxima representations, we suggest a variance-weighted minimum distance measure as the similarity metric for image query, retrieval, and search purposes. As a result, each image is indexed by a vector in the wavelet maxima moment space. Finally, performance results are reported to illustrate the robustness of the extracted features in searching for and retrieving shoeprint images independently of position, size, orientation and image background.
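    The metric is easy to state concretely: each image is indexed by a vector of wavelet-maxima moments, and unstable dimensions are down-weighted. One plausible reading (an assumption here) weights each dimension by the inverse of its variance across the database; the sketch below follows that reading, with hypothetical helper names.

```python
import numpy as np

def variance_weighted_distance(query_vec, db_vec, feature_variance):
    """Distance between two wavelet-maxima moment vectors where each
    dimension is down-weighted by its variance across the database,
    so unstable dimensions influence the ranking less."""
    eps = 1e-12  # guard against zero-variance dimensions
    weights = 1.0 / (feature_variance + eps)
    return np.sqrt(np.sum(weights * (query_vec - db_vec) ** 2))

def rank_database(query_vec, db_vecs):
    """Return database indices sorted from most to least similar."""
    variance = db_vecs.var(axis=0)
    dists = np.array([variance_weighted_distance(query_vec, v, variance)
                      for v in db_vecs])
    return np.argsort(dists)
```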

    Aggregated Deep Local Features for Remote Sensing Image Retrieval

    Remote sensing image retrieval remains a challenging topic due to the special nature of remote sensing imagery. Such images contain a variety of semantic objects, which clearly complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters, such as the multiplicative and additive attention mechanisms and the descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that, even without training, the local convolutional features and global representation outperform other systems. After system tuning, we achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than current systems.
    Comment: Published in Remote Sensing. The first two authors have equal contribution.
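    VLAD itself is a standard construction: assign each local descriptor to its nearest codebook centroid and accumulate the residuals per centroid. A minimal sketch, assuming a precomputed codebook and omitting the attention weighting described above:

```python
import numpy as np

def vlad(local_descriptors, codebook):
    """Vector of Locally Aggregated Descriptors.

    local_descriptors: (n, d) array of local features from one image.
    codebook:          (k, d) array of visual-word centroids.
    Returns a (k * d,) global descriptor.
    """
    # Hard-assign each descriptor to its nearest centroid.
    sq_dists = ((local_descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assignments = sq_dists.argmin(axis=1)
    # Accumulate residuals (descriptor minus centroid) per visual word.
    agg = np.zeros(codebook.shape)
    for i, k in enumerate(assignments):
        agg[k] += local_descriptors[i] - codebook[k]
    # Signed square-root and L2 normalization, as is standard for VLAD.
    v = agg.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```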

    Object Level Deep Feature Pooling for Compact Image Representation

    Convolutional Neural Network (CNN) features have been successfully employed in recent works as image descriptors for various vision tasks. But the inability of deep CNN features to exhibit invariance to geometric transformations and object compositions poses a great challenge for image search. In this work, we demonstrate the effectiveness of an objectness prior over the deep CNN features of image regions for obtaining an invariant image representation. The proposed approach represents the image as a vector of pooled CNN features describing the underlying objects. This representation provides robustness to the spatial layout of the objects in the scene and achieves invariance to general geometric transformations, such as translation, rotation and scaling. The proposed approach also leads to a compact representation of the scene, giving each image a smaller memory footprint. Experiments show that the proposed representation achieves state-of-the-art retrieval results on a set of challenging benchmark image datasets, while maintaining a compact representation.
    Comment: Deep Vision 201
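    The pooling idea can be sketched in a few lines: given one CNN descriptor per detected object region, pooling across regions yields a fixed-size vector that is indifferent to where each object sits in the scene. This is a simplified sketch under that assumption, not the paper's full pipeline (proposal generation and feature extraction are omitted):

```python
import numpy as np

def object_level_descriptor(region_features):
    """Pool per-region CNN descriptors into one compact image vector.

    region_features: (r, d) array, one CNN descriptor per object region.
    Max-pooling across regions discards the spatial layout, which is
    what buys invariance to translation and object rearrangement.
    """
    pooled = region_features.max(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled
```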

    Particular object retrieval with integral max-pooling of CNN activations

    Recently, image representations built upon Convolutional Neural Networks (CNNs) have been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and are still outperformed, on some particular object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN. We build compact feature vectors that encode several image regions without the need to feed multiple inputs to the network. Furthermore, we extend integral images to handle max-pooling on convolutional layer activations, allowing us to efficiently localize matching objects. The resulting bounding box is finally used for image re-ranking. As a result, this paper significantly improves the existing CNN-based recognition pipeline: we report, for the first time, results competing with traditional methods on the challenging Oxford5k and Paris6k datasets.
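    The efficiency claim rests on integral images (summed-area tables), which reduce pooling over any rectangular region of an activation map to four array lookups. The minimal sketch below, with hypothetical function names, shows only the underlying O(1) box-sum primitive on a single activation map; the paper builds its approximate max-pooling on top of this idea.

```python
import numpy as np

def integral_image(activation_map):
    """Summed-area table of an (h, w) activation map, zero-padded so
    any box sum can be read back with four lookups."""
    ii = np.zeros((activation_map.shape[0] + 1, activation_map.shape[1] + 1))
    ii[1:, 1:] = activation_map.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of activations in rows [top, bottom) and cols [left, right),
    in O(1) regardless of box size."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

    Scanning many candidate boxes with such an O(1) primitive is what makes exhaustive object localization over the activation map affordable.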