Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Factor analysis aims to determine latent factors, or traits, which summarize
a given data set. Inter-battery factor analysis extends this notion to multiple
views of the data. In this paper we show how a nonlinear, nonparametric version
of these models can be recovered through the Gaussian process latent variable
model. This gives us a flexible formalism for multi-view learning where the
latent variables can be used both for exploratory purposes and for learning
representations that enable efficient inference for ambiguous estimation tasks.
Learning is performed in a Bayesian manner through the formulation of a
variational compression scheme which gives a rigorous lower bound on the log
likelihood. Our Bayesian framework provides strong regularization during
training, allowing the structure of the latent space to be determined
efficiently and automatically. We demonstrate this by producing the first (to
our knowledge) published results of learning from dozens of views, even when
data is scarce. We further show experimental results on several different types
of multi-view data sets and for different kinds of tasks, including exploratory
data analysis, generation, ambiguity modelling through latent priors and
classification. Comment: 49 pages including appendi
Graph Regularized Tensor Sparse Coding for Image Representation
Sparse coding (SC) is an unsupervised learning scheme that has received an
increasing amount of interest in recent years. However, conventional SC
vectorizes the input images, which destroys the intrinsic spatial structures
of the images. In this paper, we propose a novel graph regularized tensor
sparse coding (GTSC) for image representation. GTSC preserves the local
proximity of elementary structures in the image by adopting the newly proposed
tubal-tensor representation. Simultaneously, it considers the intrinsic
geometric properties by imposing graph regularization that has been
successfully applied to uncover the geometric distribution for the image data.
Moreover, the sparse representations returned by GTSC have clearer physical
interpretations, as the key operation (i.e., circular convolution) in the
tubal-tensor model preserves the shift-invariance property. Experimental
results on image clustering demonstrate the effectiveness of the proposed
scheme.
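The shift property that the abstract above attributes to circular convolution can be checked on a small example. The sketch below is only an illustration in plain Python, not the GTSC method itself: the helper names `circular_convolve` and `cyclic_shift` and the sample vectors are assumptions introduced here. It shows that shifting an input and then convolving gives the same result as convolving and then shifting.

```python
def circular_convolve(a, b):
    """Circular convolution of two equal-length lists (hypothetical helper)."""
    n = len(a)
    if len(b) != n:
        raise ValueError("inputs must have the same length")
    # Index arithmetic wraps around modulo n, which is what makes it circular.
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def cyclic_shift(x, k):
    """Rotate the list x to the right by k positions (hypothetical helper)."""
    n = len(x)
    k %= n
    return x[-k:] + x[:-k] if k else list(x)


# Illustrative data only; not taken from the paper.
a = [1, 2, 3, 4]
b = [1, 0, -1, 0]

shift_then_convolve = circular_convolve(cyclic_shift(a, 1), b)
convolve_then_shift = cyclic_shift(circular_convolve(a, b), 1)

print(shift_then_convolve)  # [2, -2, -2, 2]
print(convolve_then_shift)  # [2, -2, -2, 2]
```

In the tubal-tensor setting this operation would act on tube fibres of a third-order tensor; the one-dimensional case is used here only to keep the property easy to verify by hand.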
Lip syncing method for realistic expressive 3D face model
Lip synchronization of 3D face models is now used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and is growing in use and importance. A high level of realism can be used in demanding applications such as computer games and cinema. Authoring lip syncing with complex and subtle expressions is still difficult and fraught with problems in terms of realism. This research proposed a lip syncing method for a realistic expressive 3D face model. Animating lips requires a 3D face model capable of representing the myriad shapes the human face experiences during speech and a method to produce the correct lip shape at the correct time. The paper presented a 3D face model designed to support lip syncing that aligns with an input audio file. It deforms using a Raised Cosine Deformation (RCD) function that is grafted onto the input facial geometry. The face model was based on the MPEG-4 Facial Animation (FA) Standard. This paper proposed a method to animate the 3D face model over time to create animated lip syncing using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. The proposed research integrated emotions by considering the Ekman model and Plutchik's wheel, with emotive eye movements implemented through the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model. © 2017 Springer Science+Business Media New Yor
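The abstract above does not give the form of the Raised Cosine Deformation function, so the following is a minimal sketch under an assumption: a standard raised-cosine falloff, w(d) = 0.5 (1 + cos(pi d / R)) for d < R and 0 otherwise, used to weight a displacement applied around a control point. The names `raised_cosine_weight` and `deform`, the radius parameter and the sample vertices are all hypothetical and are not taken from the paper.

```python
import math


def raised_cosine_weight(distance, radius):
    """Assumed falloff: 1 at the centre, smoothly decreasing to 0 at the radius."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    if distance >= radius:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * distance / radius))


def deform(vertices, centre, displacement, radius):
    """Move each vertex by the displacement scaled by its raised-cosine weight."""
    moved = []
    for v in vertices:
        w = raised_cosine_weight(math.dist(v, centre), radius)
        moved.append(tuple(c + w * dc for c, dc in zip(v, displacement)))
    return moved


# Illustrative data only: one vertex at the control point, one outside the radius.
vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
result = deform(vertices, centre=(0.0, 0.0, 0.0),
                displacement=(0.0, 1.0, 0.0), radius=1.0)
print(result)  # [(0.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
```

A falloff of this kind keeps the deformation local and smooth, which is one plausible reason to graft it onto existing facial geometry; the paper's actual formulation may differ.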
Geometric Expression Invariant 3D Face Recognition using Statistical Discriminant Models
Currently there is no complete face recognition system that is invariant to all facial expressions.
Although humans find it easy to identify and recognise faces regardless of changes in illumination,
pose and expression, producing a computer system with a similar capability has proved to
be particularly difficult. Three-dimensional face models are geometric in nature and therefore
have the advantage of being invariant to head pose and lighting. However they are still susceptible
to facial expressions. This can be seen in the decrease in the recognition results using
principal component analysis when expressions are added to a data set.
In order to achieve expression-invariant face recognition systems, we have employed a tensor
algebra framework to represent 3D face data with facial expressions in a parsimonious
space. Face variation factors are organised in particular subject and facial expression modes.
We manipulate this using singular value decomposition on sub-tensors representing one variation
mode. This framework possesses the ability to deal with the shortcomings of PCA in less constrained
environments and still preserves the integrity of the 3D data. The results show improved
recognition rates for faces and facial expressions, even recognising high intensity expressions
that are not in the training datasets.
We have determined, experimentally, a set of anatomical landmarks that describe facial
expressions most effectively. We found that the best placement of landmarks to distinguish different
facial expressions is in areas around the prominent features, such as the cheeks and eyebrows.
Recognition results using landmark-based face recognition could be improved with better placement.
We looked into the possibility of achieving expression-invariant face recognition by reconstructing
and manipulating realistic facial expressions. We proposed a tensor-based statistical
discriminant analysis method to reconstruct facial expressions and in particular to neutralise
facial expressions. The results of the synthesised facial expressions are visually more realistic
than facial expressions generated using conventional active shape modelling (ASM). We
then used reconstructed neutral faces in the sub-tensor framework for recognition purposes.
The recognition results showed slight improvement. Besides biometric recognition, this novel
tensor-based synthesis approach could be used in computer games and real-time animation
applications.
A Review on Facial Expression Recognition Techniques
Facial expression has been a topic of active research over the past few decades. Recognizing and extracting various emotions from facial expressions, and validating those emotions, has become very important in human-computer interaction. Interpreting such human expressions remains challenging, and much research is still required on the way they relate to human affect. Apart from human-computer interfaces, other applications include awareness systems, medical diagnosis, surveillance, law enforcement, automated tutoring systems and many more. In recent years, different techniques have been put forward for developing automated facial expression recognition systems. This paper presents a quick survey of some of the facial expression recognition techniques. A comparative study is carried out using various feature extraction techniques. We define a taxonomy of the field and cover all the steps from face detection to facial expression classification.
Statistical modelling for facial expression dynamics
PhD
Facial expressions are one of the most powerful and fastest means of relaying emotions between humans.
The ability to capture, understand and mimic those emotions and their underlying dynamics
in the synthetic counterpart is a challenging task because of the complexity of human emotions, different
ways of conveying them, non-linearities caused by facial feature and head motion, and the
ever critical eye of the viewer. This thesis sets out to address some of the limitations of existing
techniques by investigating three components of an expression modelling and parameterisation framework:
(1) Feature and expression manifold representation, (2) Pose estimation, and (3) Expression
dynamics modelling and their parameterisation for the purpose of driving a synthetic head avatar.
First, we introduce a hierarchical representation based on the Point Distribution Model (PDM).
Holistic representations imply that non-linearities caused by the motion of facial features, and intrafeature
correlations are implicitly embedded and hence have to be accounted for in the resulting
expression space. Also such representations require large training datasets to account for all possible
variations. To address those shortcomings, and to provide a basis for learning more subtle, localised
variations, our representation consists of a tree-like structure where a holistic root component is decomposed
into leaves containing the jaw outline, each of the eyes and eyebrows and the mouth. Each
of the hierarchical components is modelled according to its intrinsic functionality, rather than the
final, holistic expression label.
Secondly, we introduce a statistical approach for capturing an underlying low-dimensional expression
manifold by utilising components of the previously defined hierarchical representation. As
Principal Component Analysis (PCA) based approaches cannot reliably capture variations caused by
large facial feature changes because of their linear nature, the underlying dynamics manifold for each
of the hierarchical components is modelled using a Hierarchical Latent Variable Model (HLVM) approach.
Whilst retaining PCA properties, such a model introduces a probability density model which
can deal with missing or incomplete data and allows discovery of internal within-cluster structures.
All of the model parameters and underlying density model are automatically estimated during the
training stage. We investigate the usefulness of such a model to larger and unseen datasets.
Thirdly, we extend the concept of the HLVM to pose estimation to address the non-linear
shape deformations and definition of the plausible pose space caused by large head motion. Since
our head rarely stays still, and its movements are intrinsically connected with the way we perceive
and understand the expressions, pose information is an integral part of their dynamics. The proposed
approach integrates into our existing hierarchical representation model. It is learned using a sparse and
discretely sampled training dataset, and generalises to a larger and continuous view-sphere.
Finally, we introduce a framework that models and extracts expression dynamics. In existing
frameworks, explicit definition of expression intensity and pose information is often overlooked,
although usually implicitly embedded in the underlying representation. We investigate modelling
of the expression dynamics based on use of static information only, and focus on its sufficiency
for the task at hand. We compare a rule-based method that utilises the existing latent structure and
provides a fusion of different components with holistic and Bayesian Network (BN) approaches. An
Active Appearance Model (AAM) based tracker is used to extract relevant information from input
sequences. Such information is subsequently used to define the parametric structure of the underlying
expression dynamics. We demonstrate that such information can be utilised to animate a synthetic
head avatar.
Submitte