
    Aggregating Deep Features For Image Retrieval

    Measuring visual similarity between two images is useful in several multimedia applications such as visual search and image retrieval. However, measuring visual similarity between two images is an ill-posed problem, which makes it a challenging task. This problem has been tackled extensively by the computer vision and machine learning communities. Nevertheless, with the recent advancements in deep learning, it is now possible to design novel image representations that allow systems to measure visual similarity more accurately than existing and widely adopted approaches, such as Fisher vectors. Unfortunately, deep-learning-based visual similarity approaches typically require post-processing stages that can be computationally expensive. To alleviate this issue, this thesis describes deep-learning-based image representations that allow a system to measure visual similarity without requiring post-processing stages. Specifically, this thesis describes max-pooling-based aggregation layers that, combined with a convolutional neural network, produce rich image representations for image retrieval without requiring expensive post-processing stages. Moreover, the proposed max-pooling-based aggregation layers are general and can be seamlessly integrated with any existing, pre-trained network. Experiments on large-scale image retrieval datasets confirm that the introduced image representations yield visual similarity measures that achieve retrieval performance comparable to or better than state-of-the-art approaches, without requiring expensive post-processing operations.
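    The aggregation idea can be pictured with a short sketch: global max-pooling over the last convolutional feature map of a pre-trained network yields a single MAC-style descriptor per image. The choice of backbone (ResNet-50), the layer cut-off, and the L2 normalisation below are assumptions for illustration, not the thesis's exact architecture.

```python
# Minimal sketch: max-pooling aggregation over a pre-trained CNN's feature map.
# Backbone choice and normalisation are illustrative assumptions (torchvision >= 0.13).
import torch
import torch.nn.functional as F
import torchvision

backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
# Keep everything up to (but excluding) global average pooling and the classifier.
encoder = torch.nn.Sequential(*list(backbone.children())[:-2]).eval()

def mac_descriptor(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, H, W) -> (B, C) max-pooled, L2-normalised descriptors."""
    with torch.no_grad():
        fmap = encoder(images)                 # (B, C, h, w) convolutional feature map
        desc = F.adaptive_max_pool2d(fmap, 1)  # global max pool over spatial dimensions
        desc = desc.flatten(1)                 # (B, C)
        return F.normalize(desc, dim=1)        # unit norm, so dot product = cosine similarity

# Usage: visual similarity between two images is then a plain dot product,
# with no post-processing stage required.
# sim = mac_descriptor(img_a) @ mac_descriptor(img_b).T
```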

    Hierarchy-based Image Embeddings for Semantic Image Retrieval

    Deep neural networks trained for classification have been found to learn powerful image representations, which are also often used for other tasks such as comparing images w.r.t. their visual similarity. However, visual similarity does not imply semantic similarity. In order to learn semantically discriminative features, we propose to map images onto class embeddings whose pair-wise dot products correspond to a measure of semantic similarity between classes. Such an embedding not only improves image retrieval results, but could also facilitate integrating semantics into other tasks, e.g., novelty detection or few-shot learning. We introduce a deterministic algorithm for computing the class centroids directly from prior world knowledge encoded in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds, and ImageNet show that our learned semantic image embeddings improve the semantic consistency of image retrieval results by a large margin. Comment: Accepted at WACV 2019. Source code: https://github.com/cvjena/semantic-embedding
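    The core idea, class embeddings whose pairwise dot products reproduce a hierarchy-derived semantic similarity, can be sketched as follows. The construction below factors a toy class-similarity matrix via its eigendecomposition; this is a generic way to obtain such embeddings, not necessarily the paper's deterministic algorithm, and the 3-class similarity values are assumptions.

```python
# Minimal sketch: turn a class-similarity matrix S into class embeddings E
# such that E @ E.T reproduces S. Toy values below are illustrative assumptions.
import numpy as np

# Toy semantic similarities derived from a class hierarchy (1.0 on the diagonal,
# larger values for classes that share a deeper common ancestor).
S = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.2],
              [0.2, 0.2, 1.0]])

# Factor S = V diag(w) V^T and take E = V sqrt(diag(w)), so E @ E.T == S.
w, V = np.linalg.eigh(S)
w = np.clip(w, 0.0, None)   # guard against tiny negative eigenvalues
E = V * np.sqrt(w)          # rows of E are the class centroids

print(np.allclose(E @ E.T, S))   # True: dot products match the target similarities
# A network can then be trained to map each image close to the centroid of its class,
# e.g. by maximising the dot product between the image embedding and E[label].
```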

    Learning Non-Metric Visual Similarity for Image Retrieval

    Measuring visual similarity between two or more instances within a data distribution is a fundamental task in image retrieval. Theoretically, non-metric distances are able to generate a more complex and accurate similarity model than metric distances, provided that the non-linear data distribution is precisely captured by the system. In this work, we explore neural network models for learning a non-metric similarity function for instance search. We argue that non-metric similarity functions based on neural networks can build a better model of human visual perception than standard metric distances. As our proposed similarity function is differentiable, we explore a fully end-to-end trainable approach for image retrieval, i.e., we learn the weights from the input image pixels to the final similarity score. Experimental evaluation shows that non-metric similarity networks are able to learn visual similarities between images and improve performance on top of state-of-the-art image representations, boosting results on standard image retrieval datasets with respect to standard metric distances.
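    A minimal sketch of such a learned, non-metric similarity head is given below: a small MLP scores a pair of image descriptors instead of applying a fixed metric distance. The layer sizes and the element-wise product/difference input features are assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch: a neural similarity head that scores descriptor pairs.
# Because the score comes from a learned network, it need not satisfy the
# axioms of a metric (e.g. the triangle inequality).
import torch
import torch.nn as nn

class SimilarityNet(nn.Module):
    def __init__(self, dim: int = 512, hidden: int = 256):
        super().__init__()
        # Input: element-wise product and absolute difference of the two descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([a * b, (a - b).abs()], dim=-1)
        return self.mlp(feats).squeeze(-1)   # unbounded similarity score

# Usage: score a batch of descriptor pairs; trainable end-to-end together
# with the network that produces the descriptors.
sim = SimilarityNet()
scores = sim(torch.randn(4, 512), torch.randn(4, 512))   # shape (4,)
```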

    Instance-weighted Central Similarity for Multi-label Image Retrieval

    Deep hashing has been widely applied to large-scale image retrieval by encoding high-dimensional data points into binary codes for efficient retrieval. Compared with pairwise/triplet-similarity-based hash learning, central-similarity-based hashing can more efficiently capture the global data distribution. For multi-label image retrieval, however, previous methods only use multiple hash centers with equal weights to generate one centroid as the learning target, which ignores the relationship between the weights of the hash centers and the proportion of instance regions in the image. To address this issue, we propose a two-step alternating optimization approach, Instance-weighted Central Similarity (ICS), to automatically learn the center weights corresponding to a hash code. Firstly, we apply a maximum entropy regularizer to prevent one hash center from dominating the loss function, and compute the center weights via projected gradient descent. Secondly, we update the neural network parameters by standard back-propagation with fixed center weights. More importantly, the learned center weights reflect the proportion of foreground instances in the image. Our method achieves state-of-the-art performance on image retrieval benchmarks, and in particular improves mAP by 1.6%-6.4% on the MS COCO dataset. Comment: 10 pages, 6 figures.
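    The center-weight step can be sketched as follows: given the hash centers of an image's labels, simplex-constrained weights for the weighted centroid are found by projected gradient descent with a maximum-entropy term. The exact loss, step size, and iteration count below are assumptions for illustration, not the paper's precise formulation.

```python
# Minimal sketch of the center-weight update (one half of the alternating scheme).
# Loss assumed here: ||code - w @ centers||^2 - lam * entropy(w), with w on the simplex.
import numpy as np

def project_simplex(v: np.ndarray) -> np.ndarray:
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def center_weights(code, centers, lam=0.1, lr=0.5, steps=100):
    """code: (K,) relaxed hash code; centers: (L, K) hash centers of the image's labels."""
    w = np.full(len(centers), 1.0 / len(centers))   # start from equal weights
    for _ in range(steps):
        centroid = w @ centers
        # Gradient of ||code - w @ centers||^2 plus lam * sum(w log w) (i.e. -lam * entropy).
        grad = -2.0 * centers @ (code - centroid) + lam * (np.log(w + 1e-12) + 1.0)
        w = project_simplex(w - lr * grad)          # keep the weights on the simplex
    return w

# The second step of the alternation (not shown) fixes these weights and updates
# the hashing network by standard back-propagation against the weighted centroid.
```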