2,051 research outputs found
Revisiting the Fisher vector for fine-grained classification
International audienceThis paper describes the joint submission of Inria and Xerox to their joint participation to the FGCOMP'2013 challenge. Although the proposed system follows most of the standard Fisher classification pipeline, we describe a few key features and good practices that significantly improve the accuracy when specifically considering fine-grain classification tasks. In particular, we consider the late fusion of two systems both based on Fisher vectors, but for which we choose drastically design choices that make them very complementary. Moreover, we propose a simple yet effective filtering strategy, which significantly boosts the performance for several class domains
DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition
Being symmetric positive-definite (SPD), covariance matrix has traditionally
been used to represent a set of local descriptors in visual recognition. Recent
study shows that kernel matrix can give considerably better representation by
modelling the nonlinearity in the local descriptor set. Nevertheless, neither
the descriptors nor the kernel matrix is deeply learned. Worse, they are
considered separately, hindering the pursuit of an optimal SPD representation.
This work proposes a deep network that jointly learns local descriptors,
kernel-matrix-based SPD representation, and the classifier via an end-to-end
training process. We derive the derivatives for the mapping from a local
descriptor set to the SPD representation to carry out backpropagation. Also, we
exploit the Daleckii-Krein formula in operator theory to give a concise and
unified result on differentiating SPD matrix functions, including the matrix
logarithm to handle the Riemannian geometry of kernel matrix. Experiments not
only show the superiority of kernel-matrix-based SPD representation with deep
local descriptors, but also verify the advantage of the proposed deep network
in pursuing better SPD representations for fine-grained image recognition
tasks
Cross-dimensional Weighting for Aggregated Deep Convolutional Features
We propose a simple and straightforward way of creating powerful image
representations via cross-dimensional weighting and aggregation of deep
convolutional neural network layer outputs. We first present a generalized
framework that encompasses a broad family of approaches and includes
cross-dimensional pooling and weighting steps. We then propose specific
non-parametric schemes for both spatial- and channel-wise weighting that boost
the effect of highly active spatial responses and at the same time regulate
burstiness effects. We experiment on different public datasets for image search
and show that our approach outperforms the current state-of-the-art for
approaches based on pre-trained networks. We also provide an easy-to-use, open
source implementation that reproduces our results.Comment: Accepted for publications at the 4th Workshop on Web-scale Vision and
Social Media (VSM), ECCV 201
Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation consisting of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, enjoys convergent
dynamics and probabilistic interpretability. Backpropagating the group-weighted
loss through this module allows learning to focus on only correcting embedding
errors that won't be resolved during subsequent clustering. Our framework,
while conceptually simple and theoretically abundant, is also practically
effective and computationally efficient. We demonstrate substantial
improvements over state-of-the-art instance segmentation for object proposal
generation, as well as demonstrating the benefits of grouping loss for
classification tasks such as boundary detection and semantic segmentation
- …