Semi-Supervised Facial Animation Retargeting
This paper presents a system for facial animation retargeting that allows learning a high-quality mapping between motion capture data and arbitrary target characters. We address one of the main challenges of existing example-based retargeting methods: the need for a large number of accurate training examples to define the correspondence between source and target expression spaces. We show that this number can be significantly reduced by leveraging the information contained in unlabeled data, i.e., facial expressions in the source or target space without corresponding poses. In contrast to labeled samples, which require time-consuming and error-prone manual character posing, unlabeled samples are easily obtained as frames of motion capture recordings or existing animations of the target character. Our system exploits this information by learning a shared latent space between motion capture and character parameters in a semi-supervised manner. We show that this approach is resilient to noisy input and missing data and significantly improves retargeting accuracy. To demonstrate its applicability, we integrate our algorithm in a performance-driven facial animation system.
Rank Priors for Continuous Non-Linear Dimensionality Reduction
Discovering the underlying low-dimensional latent structure in high-dimensional perceptual observations (e.g., images, video) can, in many cases, greatly improve performance in recognition and tracking. However, non-linear dimensionality reduction methods are often susceptible to local minima and perform poorly when initialized far from the global optimum, even when the intrinsic dimensionality is known a priori. In this work we introduce a prior over the dimensionality of the latent space that penalizes high-dimensional spaces, and simultaneously optimize both the latent space and its intrinsic dimensionality in a continuous fashion. Ad-hoc initialization schemes are unnecessary with our approach; we initialize the latent space to the observation space and automatically infer the latent dimensionality. We report results applying our prior to various probabilistic non-linear dimensionality reduction tasks, and show that our method can outperform graph-based dimensionality reduction techniques as well as previously suggested initialization strategies. We demonstrate the effectiveness of our approach when tracking and classifying human motion.
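The continuous dropping of dimensions can be pictured with a small sketch. Assuming (as our illustration, not the paper's exact formulation) a smooth penalty on the singular values of the latent coordinate matrix, a gradient-based optimizer can shrink unneeded dimensions toward zero instead of selecting a discrete rank. The `rank_penalty` function below is hypothetical:

```python
import numpy as np

def rank_penalty(Z, eps=1e-3):
    """Smooth surrogate for the rank of the latent coordinates Z (n x d).
    Summing log(sigma^2 + eps) over singular values penalizes
    high-dimensional latent spaces while remaining differentiable, so
    optimization can collapse dimensions in a continuous fashion."""
    s = np.linalg.svd(Z, compute_uv=False)
    return float(np.sum(np.log(s**2 + eps)))

rng = np.random.default_rng(0)
Z_full = rng.normal(size=(100, 5))   # latent space using all 5 dimensions
Z_low = Z_full.copy()
Z_low[:, 2:] *= 1e-3                 # last 3 dimensions nearly collapsed
# The collapsed configuration incurs a much lower penalty.
print(rank_penalty(Z_full) > rank_penalty(Z_low))  # True
```

A penalty of this shape rewards configurations whose trailing singular values approach zero, which is one way to make the effective latent dimensionality itself an object of continuous optimization.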
Automatic model construction with Gaussian processes
This thesis develops a method for automatically constructing, visualizing, and describing a large class of models, useful for forecasting and finding structure in domains such as time series, geological formations, and physical dynamics. These models, based on Gaussian processes, can capture many types of statistical structure, such as periodicity, changepoints, additivity, and symmetries. Such structure can be encoded through kernels, which have historically been hand-chosen by experts. We show how to automate this task, creating a system that explores an open-ended space of models and reports the structures discovered.
To automatically construct Gaussian process models, we search over sums and products of kernels, maximizing the approximate marginal likelihood. We show how any model in this class can be automatically decomposed into qualitatively different parts, and how each component can be visualized and described through text. We combine these results into a procedure that, given a dataset, automatically constructs a model along with a detailed report containing plots and generated text that illustrate the structure discovered in the data.
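The search over sums and products of kernels can be sketched as a greedy procedure. The snippet below is a deliberately simplified illustration, not the thesis implementation: hyperparameters are fixed rather than optimized, inputs are one-dimensional, and the exact log marginal likelihood stands in for the approximate one.

```python
import numpy as np

# Base kernels on 1-D inputs (unit hyperparameters for simplicity).
def rbf(x, y):      return np.exp(-0.5 * (x[:, None] - y[None, :])**2)
def periodic(x, y): return np.exp(-2.0 * np.sin(np.pi * (x[:, None] - y[None, :]))**2)
def linear(x, y):   return x[:, None] * y[None, :]

BASE = {"RBF": rbf, "Per": periodic, "Lin": linear}

def log_marginal_likelihood(k, x, y, noise=0.1):
    """Exact GP log marginal likelihood: log N(y | 0, K + noise^2 I)."""
    K = k(x, x) + noise**2 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return float(-0.5 * y @ alpha - np.log(np.diag(L)).sum()
                 - 0.5 * len(x) * np.log(2 * np.pi))

def greedy_kernel_search(x, y, depth=2):
    """Greedily grow a kernel expression by trying '+ base' and '* base'
    expansions, keeping an expansion only if it improves the score."""
    name, kern = max(BASE.items(),
                     key=lambda kv: log_marginal_likelihood(kv[1], x, y))
    for _ in range(depth - 1):
        candidates = {}
        for bname, b in BASE.items():
            candidates[f"({name} + {bname})"] = \
                lambda p, q, k=kern, b=b: k(p, q) + b(p, q)
            candidates[f"({name} * {bname})"] = \
                lambda p, q, k=kern, b=b: k(p, q) * b(p, q)
        best = max(candidates,
                   key=lambda n: log_marginal_likelihood(candidates[n], x, y))
        if (log_marginal_likelihood(candidates[best], x, y)
                <= log_marginal_likelihood(kern, x, y)):
            break
        name, kern = best, candidates[best]
    return name, kern

x = np.linspace(0, 4, 60)
y = np.sin(2 * np.pi * x) + 0.5 * x   # periodic signal plus linear trend
print(greedy_kernel_search(x, y)[0])
```

Because sums and products of positive semi-definite kernels are again positive semi-definite, every expression the search visits remains a valid Gaussian process covariance.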
The introductory chapters contain a tutorial showing how to express many types of structure through kernels, and how adding and multiplying different kernels combines their properties. Examples also show how symmetric kernels can produce priors over topological manifolds such as cylinders, toruses, and Möbius strips, as well as their higher-dimensional generalizations.
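As a minimal illustration of these combination rules (our own example, with unit period and lengthscale fixed), multiplying a periodic kernel on one input axis with an RBF kernel on another yields a prior over functions on a cylinder: the periodic axis wraps around, while the other behaves like an ordinary smooth axis.

```python
import numpy as np

def per1(a, b):   # periodic in a single coordinate, period 1
    return np.exp(-2.0 * np.sin(np.pi * (a - b))**2)

def rbf1(a, b):   # smooth (RBF) in a single coordinate
    return np.exp(-0.5 * (a - b)**2)

def cylinder_kernel(p, q):
    """Product kernel: periodic along axis 0, RBF along axis 1.
    Moving one full period along axis 0 returns to a perfectly
    correlated point, so samples live on a cylinder."""
    return per1(p[0], q[0]) * rbf1(p[1], q[1])

p = np.array([0.3, 1.0])
shifted = p + np.array([1.0, 0.0])   # one full period along axis 0
print(cylinder_kernel(p, p))         # 1.0: a point with itself
print(cylinder_kernel(p, shifted))   # also 1.0: the axis wraps around
```

Along axis 1, by contrast, correlation decays with distance as usual, which is what distinguishes the cylinder from a torus (where both axes would be periodic).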
This thesis also explores several extensions to Gaussian process models. First, building on existing work that relates Gaussian processes and neural nets, we analyze natural extensions of these models to deep kernels and deep Gaussian processes. Second, we examine additive Gaussian processes, showing their relation to the regularization method of dropout. Third, we combine Gaussian processes with the Dirichlet process to produce the warped mixture model: a Bayesian clustering model having nonparametric cluster shapes, and a corresponding latent space in which each cluster has an interpretable parametric form.

This work was supported by the Natural Sciences and Engineering Research Council of Canada, the Cambridge Commonwealth Trust, Pembroke College, a grant from the Engineering and Physical Sciences Research Council, and a grant from Google.
Realtime Face Tracking and Animation
Capturing and processing human geometry, appearance, and motion is at the core of computer graphics, computer vision, and human-computer interaction. The high complexity of human geometry and motion dynamics, and the high sensitivity of the human visual system to variations and subtleties in faces and bodies, make the 3D acquisition and reconstruction of humans in motion a challenging task. Digital humans are often created through a combination of 3D scanning, appearance acquisition, and motion capture, leading to stunning results in recent feature films. However, these methods typically require complex acquisition systems and substantial manual post-processing. As a result, creating and animating high-quality digital avatars entails long turn-around times and substantial production costs. Recent technological advances in RGB-D devices, such as the Microsoft Kinect, have brought new hope for realtime, portable, and affordable systems that can capture facial expressions as well as hand and body motions. RGB-D devices typically capture an image and a depth map, which makes it possible to formulate motion tracking as a 2D/3D non-rigid registration of a deformable model to the input data. We introduce a novel face tracking algorithm that combines geometry and texture registration with pre-recorded animation priors in a single optimization, leading to unprecedented face tracking quality on a low-cost, consumer-level device. The main drawback of this approach in the context of consumer applications is the need for offline user-specific training: robust and efficient tracking is achieved by building an accurate 3D expression model of the user's face, which is scanned in a predefined set of facial expressions. We extended this approach, removing the need for user-specific training, calibration, or any other form of manual assistance, by modeling a user-specific dynamic 3D face model online.
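The "single optimization" combining geometry, texture, and an animation prior can be summarized as minimizing a weighted sum of three terms. The sketch below is our schematic of such an energy, not the published formulation: the linear "renderers" `A_depth` and `A_color` are toy stand-ins for the model's depth and appearance synthesis, and the Gaussian animation prior is a hypothetical simplification.

```python
import numpy as np

rng = np.random.default_rng(1)
D, G, T = 8, 30, 30                  # expression params, depth and color pixels
A_depth = rng.normal(size=(G, D))    # toy linear stand-in for depth rendering
A_color = rng.normal(size=(T, D))    # toy linear stand-in for texture rendering

def tracking_energy(x, depth_obs, color_obs, prior_mean, prior_prec,
                    w_geom=1.0, w_tex=0.5, w_prior=0.1):
    """Schematic registration energy over expression parameters x:
    a geometry term (fit to the depth map), a texture term (fit to the
    color image), and an animation-prior term (Mahalanobis distance to
    a distribution learned from pre-recorded sequences)."""
    geom = np.sum((A_depth @ x - depth_obs)**2)
    tex = np.sum((A_color @ x - color_obs)**2)
    d = x - prior_mean
    prior = float(d @ prior_prec @ d)
    return w_geom * geom + w_tex * tex + w_prior * prior

x_true = rng.normal(size=D)
depth_obs = A_depth @ x_true
color_obs = A_color @ x_true
prior_mean, prior_prec = np.zeros(D), np.eye(D)
e_true = tracking_energy(x_true, depth_obs, color_obs, prior_mean, prior_prec)
e_bad = tracking_energy(x_true + 1.0, depth_obs, color_obs, prior_mean, prior_prec)
print(e_true < e_bad)  # True: the generating expression scores lower energy
```

In a real tracker the rendering terms are non-linear in the expression and pose parameters, so the energy is minimized iteratively each frame rather than in closed form.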
Complementing the realtime face tracking and modeling algorithm, we developed a novel system for animation retargeting that learns a high-quality mapping between motion capture data and arbitrary target characters. We addressed one of the main challenges of existing example-based retargeting methods: the need for a large number of accurate training examples to define the correspondence between source and target expression spaces. We showed that this number can be significantly reduced by leveraging the information contained in unlabeled data, i.e., facial expressions in the source or target space without corresponding poses. Finally, we present a novel realtime physics-based animation technique that can simulate a wide range of deformable materials such as fat, flesh, hair, and muscles. This approach can be used to produce more lifelike animations by enhancing animated avatars with secondary effects. We believe that the realtime face tracking and animation pipeline presented in this thesis has the potential to inspire future research in computer-generated animation. Several ideas presented in this thesis have already been used successfully in industry, and this work gave birth to the startup company faceshift AG.