DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
Human face exhibits an inherent hierarchy in its representations (i.e.,
holistic facial expressions can be encoded via a set of facial action units
(AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown
great results in unsupervised extraction of hierarchical latent representations
from large amounts of image data, while being robust to noise and other
undesired artifacts. Potentially, this makes VAEs a suitable approach for
learning facial features for AU intensity estimation. Yet, most existing
VAE-based methods apply classifiers learned separately from the encoded
features. By contrast, the non-parametric (probabilistic) approaches, such as
Gaussian Processes (GPs), typically outperform their parametric counterparts,
but cannot deal easily with large amounts of data. To this end, we propose a
novel VAE semi-parametric modeling framework, named DeepCoder, which combines
the modeling power of parametric (convolutional) and nonparametric (ordinal
GPs) VAEs, for joint learning of (1) latent representations at multiple levels
in a task hierarchy, and (2) classification of multiple ordinal outputs. We
show on benchmark datasets for AU intensity estimation that the proposed
DeepCoder outperforms the state-of-the-art approaches, and related VAEs and
deep learning models. Comment: accepted at ICCV 2017.
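As a minimal sketch of the VAE machinery this abstract builds on (not the authors' DeepCoder implementation), the two standard ingredients are the reparameterization trick, which makes sampling from the encoder's Gaussian posterior differentiable, and the closed-form KL term that regularizes the latent space; all function names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps with eps ~ N(0, I): sampling becomes a
    # deterministic, differentiable function of (mu, logvar)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_divergence(mu, logvar):
    # closed-form KL( N(mu, sigma^2) || N(0, I) ), one value per sample
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)

# a batch of 4 samples with a 2-dimensional latent code
mu = np.zeros((4, 2))
logvar = np.zeros((4, 2))
z = reparameterize(mu, logvar, rng)
kl = kl_divergence(mu, logvar)
```

With `mu = 0` and `logvar = 0` the approximate posterior already equals the standard-normal prior, so the KL term is exactly zero; during training this term is added to the reconstruction loss.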
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation, and manifold learning.
Deep Gaussian Processes with Convolutional Kernels
Deep Gaussian processes (DGPs) provide a Bayesian non-parametric alternative
to standard parametric deep learning models. A DGP is formed by stacking
multiple GPs resulting in a well-regularized composition of functions. The
Bayesian framework equips the model with attractive properties, such as
implicit capacity control and predictive uncertainty, but at the same time
makes it challenging to combine with a convolutional structure. This has hindered the
application of DGPs in computer vision tasks, an area where deep parametric
models (i.e. CNNs) have made breakthroughs. Standard kernels used in DGPs such
as radial basis functions (RBFs) are insufficient for handling pixel
variability in raw images. In this paper, we build on the recent convolutional
GP to develop Convolutional DGP (CDGP) models which effectively capture image
level features through the use of convolution kernels, therefore opening up the
way for applying DGPs to computer vision tasks. Our model learns local spatial
influence and outperforms strong GP based baselines on multi-class image
classification. We also consider various constructions of convolution kernel
over the image patches, analyze the computational trade-offs and provide an
efficient framework for convolutional DGP models. The experimental results on
image data such as MNIST, rectangles-image, CIFAR10 and Caltech101 demonstrate
the effectiveness of the proposed approaches.
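To illustrate the idea of a convolution kernel over image patches (a simplified sketch in the spirit of the convolutional GP the paper builds on, not the CDGP model itself), one construction applies a base RBF kernel to every pair of patches from two images and averages the responses; patch size and lengthscale below are arbitrary illustrative choices.

```python
import numpy as np

def extract_patches(img, p):
    # all overlapping p x p patches of a square image, each flattened
    h, w = img.shape
    return np.array([img[i:i + p, j:j + p].ravel()
                     for i in range(h - p + 1)
                     for j in range(w - p + 1)])

def rbf(a, b, lengthscale=1.0):
    # squared-exponential base kernel between two sets of patch vectors
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def conv_kernel(x, y, p=3, lengthscale=1.0):
    # convolutional kernel: average base-kernel response over all
    # patch pairs, giving the kernel local spatial structure
    px, py = extract_patches(x, p), extract_patches(y, p)
    return rbf(px, py, lengthscale).mean()

x = np.zeros((5, 5))          # a toy 5x5 "image"
k_xx = conv_kernel(x, x)      # self-similarity: every patch pair matches
k_xy = conv_kernel(x, np.ones((5, 5)))  # dissimilar image scores lower
```

Identical images yield the maximal response (here exactly 1.0, since every patch-pair distance is zero), while differing images score strictly lower; the paper's contribution lies in making such patch-based kernels tractable inside a stacked, deep GP.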