54,730 research outputs found
Class-Weighted Convolutional Features for Visual Instance Search
Image retrieval in realistic scenarios targets large dynamic datasets of
unlabeled images. In these cases, training or fine-tuning a model every time
new images are added to the database is neither efficient nor scalable.
Convolutional neural networks trained for image classification over large
datasets have been proven effective feature extractors for image retrieval. The
most successful approaches are based on encoding the activations of
convolutional layers, as they convey the image spatial information. In this
paper, we go beyond this spatial information and propose a local-aware encoding
of convolutional features based on semantic information predicted in the target
image. To this end, we obtain the most discriminative regions of an image using
Class Activation Maps (CAMs). CAMs are based on the knowledge contained in the
network and therefore, our approach, has the additional advantage of not
requiring external information. In addition, we use CAMs to generate object
proposals during an unsupervised re-ranking stage after a first fast search.
Our experiments on two public available datasets for instance retrieval,
Oxford5k and Paris6k, demonstrate the competitiveness of our approach
outperforming the current state-of-the-art when using off-the-shelf models
trained on ImageNet. The source code and model used in this paper are publicly
available at http://imatge-upc.github.io/retrieval-2017-cam/.Comment: To appear in the British Machine Vision Conference (BMVC), September
201
From Facial Parts Responses to Face Detection: A Deep Learning Approach
In this paper, we propose a novel deep convolutional network (DCN) that
achieves outstanding performance on FDDB, PASCAL Face, and AFW. Specifically,
our method achieves a high recall rate of 90.99% on the challenging FDDB
benchmark, outperforming the state-of-the-art method by a large margin of
2.91%. Importantly, we consider finding faces from a new perspective through
scoring facial parts responses by their spatial structure and arrangement. The
scoring mechanism is carefully formulated considering challenging cases where
faces are only partially visible. This consideration allows our network to
detect faces under severe occlusion and unconstrained pose variation, which are
the main difficulty and bottleneck of most existing face detection approaches.
We show that despite the use of DCN, our network can achieve practical runtime
speed.Comment: To appear in ICCV 201
- …