32,024 research outputs found
DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
Human face exhibits an inherent hierarchy in its representations (i.e.,
holistic facial expressions can be encoded via a set of facial action units
(AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown
great results in unsupervised extraction of hierarchical latent representations
from large amounts of image data, while being robust to noise and other
undesired artifacts. Potentially, this makes VAEs a suitable approach for
learning facial features for AU intensity estimation. Yet, most existing
VAE-based methods apply classifiers learned separately from the encoded
features. By contrast, the non-parametric (probabilistic) approaches, such as
Gaussian Processes (GPs), typically outperform their parametric counterparts,
but cannot deal easily with large amounts of data. To this end, we propose a
novel VAE semi-parametric modeling framework, named DeepCoder, which combines
the modeling power of parametric (convolutional) and nonparametric (ordinal
GPs) VAEs, for joint learning of (1) latent representations at multiple levels
in a task hierarchy1, and (2) classification of multiple ordinal outputs. We
show on benchmark datasets for AU intensity estimation that the proposed
DeepCoder outperforms the state-of-the-art approaches, and related VAEs and
deep learning models.Comment: ICCV 2017 - accepte
Single camera pose estimation using Bayesian filtering and Kinect motion priors
Traditional approaches to upper body pose estimation using monocular vision
rely on complex body models and a large variety of geometric constraints. We
argue that this is not ideal and somewhat inelegant as it results in large
processing burdens, and instead attempt to incorporate these constraints
through priors obtained directly from training data. A prior distribution
covering the probability of a human pose occurring is used to incorporate
likely human poses. This distribution is obtained offline, by fitting a
Gaussian mixture model to a large dataset of recorded human body poses, tracked
using a Kinect sensor. We combine this prior information with a random walk
transition model to obtain an upper body model, suitable for use within a
recursive Bayesian filtering framework. Our model can be viewed as a mixture of
discrete Ornstein-Uhlenbeck processes, in that states behave as random walks,
but drift towards a set of typically observed poses. This model is combined
with measurements of the human head and hand positions, using recursive
Bayesian estimation to incorporate temporal information. Measurements are
obtained using face detection and a simple skin colour hand detector, trained
using the detected face. The suggested model is designed with analytical
tractability in mind and we show that the pose tracking can be
Rao-Blackwellised using the mixture Kalman filter, allowing for computational
efficiency while still incorporating bio-mechanical properties of the upper
body. In addition, the use of the proposed upper body model allows reliable
three-dimensional pose estimates to be obtained indirectly for a number of
joints that are often difficult to detect using traditional object recognition
strategies. Comparisons with Kinect sensor results and the state of the art in
2D pose estimation highlight the efficacy of the proposed approach.Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014
conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video:
https://www.youtube.com/watch?v=dJMTSo7-uF
- …