10,870 research outputs found
Mixture of Bilateral-Projection Two-dimensional Probabilistic Principal Component Analysis
The probabilistic principal component analysis (PPCA) is built upon a global
linear mapping, with which it is insufficient to model complex data variation.
This paper proposes a mixture of bilateral-projection probabilistic principal
component analysis model (mixB2DPPCA) on 2D data. With multi-components in the
mixture, this model can be seen as a soft cluster algorithm and has capability
of modeling data with complex structures. A Bayesian inference scheme has been
proposed based on the variational EM (Expectation-Maximization) approach for
learning model parameters. Experiments on some publicly available databases
show that the performance of mixB2DPPCA has been largely improved, resulting in
more accurate reconstruction errors and recognition rates than the existing
PCA-based algorithms
Log-Euclidean Bag of Words for Human Action Recognition
Representing videos by densely extracted local space-time features has
recently become a popular approach for analysing actions. In this paper, we
tackle the problem of categorising human actions by devising Bag of Words (BoW)
models based on covariance matrices of spatio-temporal features, with the
features formed from histograms of optical flow. Since covariance matrices form
a special type of Riemannian manifold, the space of Symmetric Positive Definite
(SPD) matrices, non-Euclidean geometry should be taken into account while
discriminating between covariance matrices. To this end, we propose to embed
SPD manifolds to Euclidean spaces via a diffeomorphism and extend the BoW
approach to its Riemannian version. The proposed BoW approach takes into
account the manifold geometry of SPD matrices during the generation of the
codebook and histograms. Experiments on challenging human action datasets show
that the proposed method obtains notable improvements in discrimination
accuracy, in comparison to several state-of-the-art methods
EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis
Data clustering has received a lot of attention and numerous methods,
algorithms and software packages are available. Among these techniques,
parametric finite-mixture models play a central role due to their interesting
mathematical properties and to the existence of maximum-likelihood estimators
based on expectation-maximization (EM). In this paper we propose a new mixture
model that associates a weight with each observed point. We introduce the
weighted-data Gaussian mixture and we derive two EM algorithms. The first one
considers a fixed weight for each observation. The second one treats each
weight as a random variable following a gamma distribution. We propose a model
selection method based on a minimum message length criterion, provide a weight
initialization strategy, and validate the proposed algorithms by comparing them
with several state of the art parametric and non-parametric clustering
techniques. We also demonstrate the effectiveness and robustness of the
proposed clustering technique in the presence of heterogeneous data, namely
audio-visual scene analysis.Comment: 14 pages, 4 figures, 4 table
Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
We present a comparative evaluation of various techniques for action
recognition while keeping as many variables as possible controlled. We employ
two categories of Riemannian manifolds: symmetric positive definite matrices
and linear subspaces. For both categories we use their corresponding nearest
neighbour classifiers, kernels, and recent kernelised sparse representations.
We compare against traditional action recognition techniques based on Gaussian
mixture models and Fisher vectors (FVs). We evaluate these action recognition
techniques under ideal conditions, as well as their sensitivity in more
challenging conditions (variations in scale and translation). Despite recent
advancements for handling manifolds, manifold based techniques obtain the
lowest performance and their kernel representations are more unstable in the
presence of challenging conditions. The FV approach obtains the highest
accuracy under ideal conditions. Moreover, FV best deals with moderate scale
and translation changes
Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions
Head-pose estimation has many applications, such as social event analysis,
human-robot and human-computer interaction, driving assistance, and so forth.
Head-pose estimation is challenging because it must cope with changing
illumination conditions, variabilities in face orientation and in appearance,
partial occlusions of facial landmarks, as well as bounding-box-to-face
alignment errors. We propose tu use a mixture of linear regressions with
partially-latent output. This regression method learns to map high-dimensional
feature vectors (extracted from bounding boxes of faces) onto the joint space
of head-pose angles and bounding-box shifts, such that they are robustly
predicted in the presence of unobservable phenomena. We describe in detail the
mapping method that combines the merits of unsupervised manifold learning
techniques and of mixtures of regressions. We validate our method with three
publicly available datasets and we thoroughly benchmark four variants of the
proposed algorithm with several state-of-the-art head-pose estimation methods.Comment: 12 pages, 5 figures, 3 table
- …