Graph-based classification of multiple observation sets
We consider the problem of classification of an object given multiple
observations that possibly include different transformations. The possible
transformations of the object generally span a low-dimensional manifold in the
original signal space. We propose to take advantage of this manifold structure
for the effective classification of the object represented by the observation
set. In particular, we design a low-complexity solution that is able to exploit
the properties of the data manifolds with a graph-based algorithm. Hence, we
formulate the computation of the unknown label matrix as a smoothing process on
the manifold under the constraint that all observations represent an object of
a single class. This results in a discrete optimization problem, which can be
solved by an efficient, low-complexity algorithm. We demonstrate the
performance of the proposed graph-based algorithm in the classification of sets
of multiple images. Moreover, we show its high potential in video-based face
recognition, where it outperforms state-of-the-art solutions that fall short of
exploiting the manifold structure of the face image data sets.
Comment: New content adde
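The idea of smoothing labels over a graph while forcing the whole observation set into one class can be illustrated with a small sketch. This is not the algorithm from the paper: it swaps in standard label propagation on a k-nearest-neighbour graph and then sums the propagated scores over the observation set to pick a single class. The function name `classify_observation_set` and the parameters `k`, `sigma`, and `alpha` are assumptions made for illustration.

```python
import numpy as np

def classify_observation_set(X_train, y_train, X_obs, n_classes,
                             k=5, sigma=1.0, alpha=0.9):
    """Assign ONE class to all rows of X_obs (hypothetical sketch)."""
    X = np.vstack([X_train, X_obs])
    n = X.shape[0]
    # Gaussian affinities from pairwise squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Keep the k strongest edges per node, then symmetrize the graph.
    idx = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    W = np.where(mask | mask.T, W, 0.0)
    # Symmetrically normalized affinity matrix S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot labels for training nodes, zeros for the observations.
    Y = np.zeros((n, n_classes))
    Y[np.arange(len(y_train)), y_train] = 1.0
    # Closed-form label propagation: F = (I - alpha * S)^{-1} Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    # Single-class constraint: pool the scores of all observations.
    scores = F[len(y_train):].sum(axis=0)
    return int(np.argmax(scores))
```

Pooling the scores before taking the argmax is the simplest way to respect the one-class constraint; the paper instead formulates a discrete optimization problem over the label matrix.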
Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
We present a comparative evaluation of various techniques for action
recognition while keeping as many variables as possible controlled. We employ
two categories of Riemannian manifolds: symmetric positive definite matrices
and linear subspaces. For both categories we use their corresponding nearest
neighbour classifiers, kernels, and recent kernelised sparse representations.
We compare against traditional action recognition techniques based on Gaussian
mixture models and Fisher vectors (FVs). We evaluate these action recognition
techniques under ideal conditions, as well as their sensitivity in more
challenging conditions (variations in scale and translation). Despite recent
advances in handling manifolds, manifold-based techniques obtain the
lowest performance, and their kernel representations are more unstable in the
presence of challenging conditions. The FV approach obtains the highest
accuracy under ideal conditions. Moreover, FV deals best with moderate scale
and translation changes.
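As a rough illustration of nearest-neighbour classification on the two manifold types mentioned above, the sketch below uses the log-Euclidean distance for symmetric positive definite matrices and the projection distance for linear subspaces. The abstract does not say which metrics the authors use, so both choices, and all function names, are assumptions.

```python
import numpy as np

def spd_log(M):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.log(w)) @ V.T

def log_euclidean_distance(A, B):
    """Frobenius norm of the difference of matrix logarithms."""
    return np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")

def projection_distance(U1, U2):
    """Distance between subspaces spanned by orthonormal bases U1, U2.

    Equals the 2-norm of the sines of the principal angles.
    """
    P1, P2 = U1 @ U1.T, U2 @ U2.T
    return np.linalg.norm(P1 - P2, ord="fro") / np.sqrt(2.0)

def nearest_neighbour(query, gallery, labels, dist):
    """Return the label of the gallery item closest to the query."""
    distances = [dist(query, g) for g in gallery]
    return labels[int(np.argmin(distances))]
```

For example, `nearest_neighbour(Q, gallery, labels, log_euclidean_distance)` classifies an SPD descriptor `Q`, while passing `projection_distance` handles orthonormal subspace bases with the same helper.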
Generalized Rank Pooling for Activity Recognition
Most popular deep models for action recognition split video sequences into
short sub-sequences consisting of a few frames; frame-based features are then
pooled for recognizing the activity. Usually, this pooling step discards the
temporal order of the frames, which could otherwise be used for better
recognition. To this end, we propose a novel pooling method, generalized
rank pooling (GRP), that takes as input features from the intermediate layers
of a CNN that is trained on tiny sub-sequences, and produces as output the
parameters of a subspace which (i) provides a low-rank approximation to the
features and (ii) preserves their temporal order. We propose to use these
parameters as a compact representation for the video sequence, which is then
used in a classification setup. We formulate an objective for computing this
subspace as a Riemannian optimization problem on the Grassmann manifold, and
propose an efficient conjugate gradient scheme for solving it. Experiments on
several activity recognition datasets show that our scheme leads to
state-of-the-art performance.
Comment: Accepted at IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR), 201
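The two properties asked of the subspace, low-rank approximation and preservation of temporal order, can be made concrete with a small sketch. It does not implement GRP or its Riemannian conjugate gradient solver: it only computes a subspace by truncated SVD (property i) and then counts how often a hypothetical order score, the squared projection norm of each frame, fails to increase over time. Using that score as the ordering criterion is an assumption made for illustration.

```python
import numpy as np

def low_rank_subspace(X, p):
    """Orthonormal basis (d x p) of the best rank-p subspace for X.

    X is d x n, with columns holding per-frame features in temporal order.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :p]

def order_violations(U, X):
    """Count consecutive frame pairs whose projection energy does not grow.

    The energy ||U^T x_t||^2 is a hypothetical order score, not
    necessarily the constraint used in the paper.
    """
    energy = np.sum((U.T @ X) ** 2, axis=0)
    return int(np.sum(np.diff(energy) <= 0))
```

A plain SVD subspace gives no guarantee that `order_violations` returns zero, which is why a method like GRP has to optimize for both properties jointly.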