5,363 research outputs found
Cross-dimensional Weighting for Aggregated Deep Convolutional Features
We propose a simple and straightforward way of creating powerful image
representations via cross-dimensional weighting and aggregation of deep
convolutional neural network layer outputs. We first present a generalized
framework that encompasses a broad family of approaches and includes
cross-dimensional pooling and weighting steps. We then propose specific
non-parametric schemes for both spatial- and channel-wise weighting that boost
the effect of highly active spatial responses and at the same time regulate
burstiness effects. We experiment on different public datasets for image search
and show that our approach outperforms the current state-of-the-art for
approaches based on pre-trained networks. We also provide an easy-to-use, open
source implementation that reproduces our results.Comment: Accepted for publications at the 4th Workshop on Web-scale Vision and
Social Media (VSM), ECCV 201
Generalized Max Pooling
State-of-the-art patch-based image representations involve a pooling
operation that aggregates statistics computed from local descriptors. Standard
pooling operations include sum- and max-pooling. Sum-pooling lacks
discriminability because the resulting representation is strongly influenced by
frequent yet often uninformative descriptors, but only weakly influenced by
rare yet potentially highly-informative ones. Max-pooling equalizes the
influence of frequent and rare descriptors but is only applicable to
representations that rely on count statistics, such as the bag-of-visual-words
(BOV) and its soft- and sparse-coding extensions. We propose a novel pooling
mechanism that achieves the same effect as max-pooling but is applicable beyond
the BOV and especially to the state-of-the-art Fisher Vector -- hence the name
Generalized Max Pooling (GMP). It involves equalizing the similarity between
each patch and the pooled representation, which is shown to be equivalent to
re-weighting the per-patch statistics. We show on five public image
classification benchmarks that the proposed GMP can lead to significant
performance gains with respect to heuristic alternatives.Comment: (to appear) CVPR 2014 - IEEE Conference on Computer Vision & Pattern
Recognition (2014
- …