8 research outputs found
Object Level Deep Feature Pooling for Compact Image Representation
Convolutional Neural Network (CNN) features have been successfully employed
in recent works as an image descriptor for various vision tasks. But the
inability of the deep CNN features to exhibit invariance to geometric
transformations and object compositions poses a great challenge for image
search. In this work, we demonstrate the effectiveness of the objectness prior
over the deep CNN features of image regions for obtaining an invariant image
representation. The proposed approach represents the image as a vector of
pooled CNN features describing the underlying objects. This representation
provides robustness to spatial layout of the objects in the scene and achieves
invariance to general geometric transformations, such as translation, rotation
and scaling. The proposed approach also leads to a compact representation of
the scene, making each image occupy a smaller memory footprint. Experiments
show that the proposed representation achieves state-of-the-art retrieval
results on a set of challenging benchmark image datasets, while maintaining a
compact representation. Comment: Deep Vision 201
An accurate retrieval through R-MAC+ descriptors for landmark recognition
The landmark recognition problem is far from being solved, but with the use
of features extracted from intermediate layers of Convolutional Neural Networks
(CNNs), excellent results have been obtained. In this work, we propose some
improvements on the creation of R-MAC descriptors in order to make the
newly-proposed R-MAC+ descriptors more representative than the previous ones.
However, the main contribution of this paper is a novel retrieval technique
that exploits the fine representativeness of the MAC descriptors of the
database images. Using these descriptors, called "db regions", during the
retrieval stage greatly improves performance. The proposed method is
tested on different public datasets: Oxford5k, Paris6k and Holidays. It
outperforms the state-of-the-art results on Holidays and reaches excellent
results on Oxford5k and Paris6k, surpassed only by approaches based on
fine-tuning strategies.
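To make the MAC and R-MAC terminology in this abstract concrete, here is a minimal sketch in plain Python. It is not the R-MAC+ method of the paper: it assumes a feature map stored as nested lists indexed `[channel][y][x]`, takes the regions as explicit half-open boxes supplied by the caller (a hypothetical input format), and omits the PCA-whitening step that regional descriptors usually go through.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def mac(feature_map, region=None):
    """MAC descriptor: channel-wise maximum activation, L2-normalized.

    feature_map is indexed [channel][y][x]; region is an assumed
    (y0, y1, x0, x1) half-open box, or None for the whole map.
    """
    maxima = []
    for channel in feature_map:
        if region is None:
            rows, x0, x1 = channel, 0, len(channel[0])
        else:
            y0, y1, x0, x1 = region
            rows = channel[y0:y1]
        maxima.append(max(max(row[x0:x1]) for row in rows))
    return l2_normalize(maxima)

def rmac(feature_map, regions):
    """Simplified R-MAC: sum the regional MAC vectors, then L2-normalize the sum."""
    total = [0.0] * len(feature_map)
    for region in regions:
        for i, value in enumerate(mac(feature_map, region)):
            total[i] += value
    return l2_normalize(total)

# Toy 2-channel, 2x2 feature map with two row-wise regions.
toy_map = [[[1, 2], [3, 4]], [[0, 0], [0, 3]]]
global_descriptor = mac(toy_map)
regional_descriptor = rmac(toy_map, [(0, 1, 0, 2), (1, 2, 0, 2)])
```

Two images can then be compared with a dot product of their descriptors, since both vectors have unit length.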
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Recent advances in deep learning have improved performance in image
retrieval tasks. Through the many convolutional layers
available in a Convolutional Neural Network (CNN), it is possible to obtain a
hierarchy of features from the evaluated image. At every step, the extracted
patches are smaller and more representative than those of the previous levels.
Following this idea, this paper introduces a new detector applied to the
feature maps extracted from a pre-trained CNN. Specifically, this approach
increases the number of features in order to improve the performance of
aggregation algorithms such as the widely used VLAD embedding. The
proposed approach is tested on different public datasets: Holidays, Oxford5k,
Paris6k and UKB.
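For readers unfamiliar with the VLAD embedding named in this abstract, the sketch below shows the basic aggregation step in plain Python. It is an illustration under stated assumptions, not the paper's pipeline: the codebook centroids are taken as given (normally they come from k-means), local descriptors are plain lists of floats, and only a final global L2 normalization is applied.

```python
import math

def vlad(descriptors, centroids):
    """Basic VLAD: sum residuals to each descriptor's nearest centroid, then flatten.

    descriptors: list of local feature vectors (lists of floats).
    centroids:   assumed pre-computed codebook, one vector per visual word.
    """
    k, dim = len(centroids), len(centroids[0])
    residuals = [[0.0] * dim for _ in range(k)]
    for desc in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(desc, centroids[j])),
        )
        for i in range(dim):
            residuals[nearest][i] += desc[i] - centroids[nearest][i]
    # Concatenate the k residual vectors and L2-normalize the result.
    flat = [value for row in residuals for value in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy example: three 2-D descriptors, two centroids.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [10.0, 12.0]]
embedding = vlad(toy_descriptors, toy_centroids)
```

The output length is the number of centroids times the descriptor dimension, so feeding in more local features changes the accumulated residuals but not the size of the embedding.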
Image Retrieval using Multi-scale CNN Features Pooling
In this paper, we address the problem of image retrieval by learning an image
representation based on the activations of a Convolutional Neural Network. We
present an end-to-end trainable network architecture that exploits a novel
multi-scale local pooling based on NetVLAD and a triplet mining procedure based
on sample difficulty to obtain an effective image representation. Extensive
experiments show that our approach is able to reach state-of-the-art results on
three standard datasets. Comment: Accepted at ICMR 202
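The abstract does not describe its triplet mining procedure, so the following is only a generic sketch of the idea of difficulty-based mining: a standard triplet margin loss plus a helper that picks the negative closest to the anchor. The function names and the margin value of 0.1 are illustrative assumptions, not taken from the paper.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Triplet margin loss: zero once the negative is farther than the positive by the margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def hardest_negative(anchor, negatives):
    """Pick the negative closest to the anchor, i.e. the most difficult one."""
    return min(negatives, key=lambda n: euclidean(anchor, n))

# Toy embeddings: one easy negative far away, one hard negative close by.
anchor, positive = [0.0, 0.0], [1.0, 0.0]
negatives = [[3.0, 0.0], [0.5, 0.0]]
hard = hardest_negative(anchor, negatives)
hard_loss = triplet_loss(anchor, positive, hard)
easy_loss = triplet_loss(anchor, positive, negatives[0])
```

In this toy case the easy negative contributes no loss at all, which is the usual motivation for mining harder samples during training.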
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e. deep-networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep networks widely used in computer vision -
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision. Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm
Deep Image Retrieval: A Survey
In recent years a vast amount of visual content has been generated and shared
from various fields, such as social media platforms, medical images, and
robotics. This abundance of content creation and sharing has introduced new
challenges. In particular, searching databases for similar content, i.e.
content-based image retrieval (CBIR), is a long-established research area, and more
efficient and accurate methods are needed for real time retrieval. Artificial
intelligence has made progress in CBIR and has significantly facilitated the
process of intelligent search. In this survey we organize and review recent
CBIR works that are developed based on deep learning algorithms and techniques,
including insights and techniques from recent papers. We identify and present
the benchmarks and evaluation methods commonly used in the field. We
collect common challenges and propose promising future directions. More
specifically, we focus on image retrieval with deep learning and organize the
state of the art methods according to the types of deep network structure, deep
features, feature enhancement methods, and network fine-tuning strategies. Our
survey considers a wide variety of recent methods, aiming to promote a global
view of the field of instance-based CBIR. Comment: 20 pages, 11 figure