Compositional Model based Fisher Vector Coding for Image Classification
Derived from the gradient vector of a generative model of local features,
Fisher vector coding (FVC) has been identified as an effective coding method
for image classification. Most, if not all, FVC implementations employ the
Gaussian mixture model (GMM) to depict the generation process of local
features. However, the representative power of the GMM could be limited because
it essentially assumes that local features can be characterized by a fixed
number of feature prototypes and the number of prototypes is usually small in
FVC. To address this limitation, in this paper we break with the convention that
a local feature is drawn from one of a few Gaussian distributions.
Instead, we adopt a compositional mechanism which assumes that a local feature
is drawn from a Gaussian distribution whose mean vector is composed as the
linear combination of multiple key components and the combination weight is a
latent random variable. In this way, we can greatly enhance the representative
power of the generative model of FVC. To implement our idea, we designed two
particular generative models with such a compositional mechanism.
Comment: Fixed typos. 16 pages. Appearing in IEEE Transactions on Pattern
Analysis and Machine Intelligence (TPAMI).
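The contrast drawn in the abstract above can be written out as a rough sketch. The symbols $B$ (a matrix whose columns are the key components), $u$ (the latent combination weight), the isotropic covariance $\sigma^2 I$, and the unspecified prior $p(u)$ are assumptions made here for illustration; the abstract does not state the form of the two models the authors actually designed.

```latex
% Conventional FVC assumption: x is drawn from one of K Gaussians (GMM)
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)

% Compositional assumption (sketch): the mean is a linear combination B u
% of key components, with the combination weight u treated as latent
p(x \mid u) = \mathcal{N}(x \mid B u, \sigma^2 I), \qquad
p(x) = \int p(x \mid u) \, p(u) \, du

% In either case, Fisher vector coding encodes x by the gradient of the
% log-likelihood with respect to the model parameters, e.g.
g(x) = \nabla_{B} \log p(x)
```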
Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
We present a comparative evaluation of various techniques for action
recognition while keeping as many variables as possible controlled. We employ
two categories of Riemannian manifolds: symmetric positive definite matrices
and linear subspaces. For both categories we use their corresponding nearest
neighbour classifiers, kernels, and recent kernelised sparse representations.
We compare against traditional action recognition techniques based on Gaussian
mixture models and Fisher vectors (FVs). We evaluate these action recognition
techniques under ideal conditions, as well as their sensitivity in more
challenging conditions (variations in scale and translation). Despite recent
advances in handling manifolds, manifold-based techniques obtain the
lowest performance and their kernel representations are more unstable in the
presence of challenging conditions. The FV approach obtains the highest
accuracy under ideal conditions. Moreover, FV copes best with moderate scale
and translation changes.
Embedding based on function approximation for large scale image search
The objective of this paper is to design an embedding method that maps local
features describing an image (e.g. SIFT) to a higher dimensional representation
useful for the image retrieval problem. First, motivated by the relationship
between the linear approximation of a nonlinear function in high dimensional
space and the state-of-the-art feature representation used in image retrieval,
i.e., VLAD, we propose a new approach for the approximation. The embedded
vectors resulting from the function approximation process are then aggregated to
form a single representation for image retrieval. Second, in order to make the
proposed embedding method applicable to large-scale problems, we further derive
a fast version in which the embedded vectors can be computed efficiently,
i.e., in closed form. We compare the proposed embedding methods with the
state of the art in the context of image search under various settings: when
the images are represented by medium length vectors, short vectors, or binary
vectors. The experimental results show that the proposed embedding methods
outperform the existing state of the art on the standard public image retrieval
benchmarks.
Comment: Accepted to TPAMI 2017. The implementation and precomputed features
of the proposed F-FAemb are released at the following link:
http://tinyurl.com/F-FAem
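For context, the VLAD baseline that the abstract above refers to can be sketched in a few lines. This is a minimal plain-Python illustration of standard VLAD aggregation (hard assignment to the nearest centroid, summed residuals, global L2 normalisation), not the F-FAemb method itself; the function name `vlad` and the toy inputs are hypothetical, and the centroids are assumed to have been learned beforehand (e.g. by k-means).

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into a single VLAD vector (illustrative sketch)."""
    k, d = len(centroids), len(centroids[0])
    residual_sums = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest centroid
        # (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for i in range(d):
            residual_sums[nearest][i] += x[i] - centroids[nearest][i]
    # Concatenate the per-centroid residual sums and L2-normalise the result.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy usage: three 2-D descriptors, two centroids (all values are made up).
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
embedding = vlad(toy_descriptors, toy_centroids)
```

The output has one block of `d` values per centroid, so its length is `k * d`; real systems typically use SIFT-sized descriptors and add further normalisation steps that are omitted here.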