2,773 research outputs found

    Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

    Get PDF
    Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors and classification.Comment: 49 pages including appendi

    Graph Regularized Tensor Sparse Coding for Image Representation

    Full text link
    Sparse coding (SC) is an unsupervised learning scheme that has received an increasing amount of interests in recent years. However, conventional SC vectorizes the input images, which destructs the intrinsic spatial structures of the images. In this paper, we propose a novel graph regularized tensor sparse coding (GTSC) for image representation. GTSC preserves the local proximity of elementary structures in the image by adopting the newly proposed tubal-tensor representation. Simultaneously, it considers the intrinsic geometric properties by imposing graph regularization that has been successfully applied to uncover the geometric distribution for the image data. Moreover, the returned sparse representations by GTSC have better physical explanations as the key operation (i.e., circular convolution) in the tubal-tensor model preserves the shifting invariance property. Experimental results on image clustering demonstrate the effectiveness of the proposed scheme

    Lip syncing method for realistic expressive 3D face model

    Get PDF
    Lip synchronization of 3D face model is now being used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and is growing in use and importance. High level of realism can be used in demanding applications such as computer games and cinema. Authoring lip syncing with complex and subtle expressions is still difficult and fraught with problems in terms of realism. This research proposed a lip syncing method of realistic expressive 3D face model. Animated lips requires a 3D face model capable of representing the myriad shapes the human face experiences during speech and a method to produce the correct lip shape at the correct time. The paper presented a 3D face model designed to support lip syncing that align with input audio file. It deforms using Raised Cosine Deformation (RCD) function that is grafted onto the input facial geometry. The face model was based on MPEG-4 Facial Animation (FA) Standard. This paper proposed a method to animate the 3D face model over time to create animated lip syncing using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. The proposed research integrated emotions by the consideration of Ekman model and Plutchik’s wheel with emotive eye movements by implementing Emotional Eye Movements Markup Language (EEMML) to produce realistic 3D face model. Β© 2017 Springer Science+Business Media New Yor

    Geometric Expression Invariant 3D Face Recognition using Statistical Discriminant Models

    No full text
    Currently there is no complete face recognition system that is invariant to all facial expressions. Although humans find it easy to identify and recognise faces regardless of changes in illumination, pose and expression, producing a computer system with a similar capability has proved to be particularly di cult. Three dimensional face models are geometric in nature and therefore have the advantage of being invariant to head pose and lighting. However they are still susceptible to facial expressions. This can be seen in the decrease in the recognition results using principal component analysis when expressions are added to a data set. In order to achieve expression-invariant face recognition systems, we have employed a tensor algebra framework to represent 3D face data with facial expressions in a parsimonious space. Face variation factors are organised in particular subject and facial expression modes. We manipulate this using single value decomposition on sub-tensors representing one variation mode. This framework possesses the ability to deal with the shortcomings of PCA in less constrained environments and still preserves the integrity of the 3D data. The results show improved recognition rates for faces and facial expressions, even recognising high intensity expressions that are not in the training datasets. We have determined, experimentally, a set of anatomical landmarks that best describe facial expression e ectively. We found that the best placement of landmarks to distinguish di erent facial expressions are in areas around the prominent features, such as the cheeks and eyebrows. Recognition results using landmark-based face recognition could be improved with better placement. We looked into the possibility of achieving expression-invariant face recognition by reconstructing and manipulating realistic facial expressions. We proposed a tensor-based statistical discriminant analysis method to reconstruct facial expressions and in particular to neutralise facial expressions. The results of the synthesised facial expressions are visually more realistic than facial expressions generated using conventional active shape modelling (ASM). We then used reconstructed neutral faces in the sub-tensor framework for recognition purposes. The recognition results showed slight improvement. Besides biometric recognition, this novel tensor-based synthesis approach could be used in computer games and real-time animation applications

    A Review on Facial Expression Recognition Techniques

    Get PDF
    Facial expression is in the topic of active research over the past few decades. Recognition and extracting various emotions and validating those emotions from the facial expression become very important in human computer interaction. Interpreting such human expression remains and much of the research is required about the way they relate to human affect. Apart from H-I interfaces other applications include awareness system, medical diagnosis, surveillance, law enforcement, automated tutoring system and many more. In the recent year different technique have been put forward for developing automated facial expression recognition system. This paper present quick survey on some of the facial expression recognition techniques. A comparative study is carried out using various feature extraction techniques. We define taxonomy of the field and cover all the steps from face detection to facial expression classification

    Statistical modelling for facial expression dynamics

    Get PDF
    PhDOne of the most powerful and fastest means of relaying emotions between humans are facial expressions. The ability to capture, understand and mimic those emotions and their underlying dynamics in the synthetic counterpart is a challenging task because of the complexity of human emotions, different ways of conveying them, non-linearities caused by facial feature and head motion, and the ever critical eye of the viewer. This thesis sets out to address some of the limitations of existing techniques by investigating three components of expression modelling and parameterisation framework: (1) Feature and expression manifold representation, (2) Pose estimation, and (3) Expression dynamics modelling and their parameterisation for the purpose of driving a synthetic head avatar. First, we introduce a hierarchical representation based on the Point Distribution Model (PDM). Holistic representations imply that non-linearities caused by the motion of facial features, and intrafeature correlations are implicitly embedded and hence have to be accounted for in the resulting expression space. Also such representations require large training datasets to account for all possible variations. To address those shortcomings, and to provide a basis for learning more subtle, localised variations, our representation consists of tree-like structure where a holistic root component is decomposed into leaves containing the jaw outline, each of the eye and eyebrows and the mouth. Each of the hierarchical components is modelled according to its intrinsic functionality, rather than the final, holistic expression label. Secondly, we introduce a statistical approach for capturing an underlying low-dimension expression manifold by utilising components of the previously defined hierarchical representation. As Principal Component Analysis (PCA) based approaches cannot reliably capture variations caused by large facial feature changes because of its linear nature, the underlying dynamics manifold for each of the hierarchical components is modelled using a Hierarchical Latent Variable Model (HLVM) approach. Whilst retaining PCA properties, such a model introduces a probability density model which can deal with missing or incomplete data and allows discovery of internal within cluster structures. All of the model parameters and underlying density model are automatically estimated during the training stage. We investigate the usefulness of such a model to larger and unseen datasets. Thirdly, we extend the concept of HLVM model to pose estimation to address the non-linear shape deformations and definition of the plausible pose space caused by large head motion. Since our head rarely stays still, and its movements are intrinsically connected with the way we perceive and understand the expressions, pose information is an integral part of their dynamics. The proposed 3 approach integrates into our existing hierarchical representation model. It is learned using sparse and discreetly sampled training dataset, and generalises to a larger and continuous view-sphere. Finally, we introduce a framework that models and extracts expression dynamics. In existing frameworks, explicit definition of expression intensity and pose information, is often overlooked, although usually implicitly embedded in the underlying representation. We investigate modelling of the expression dynamics based on use of static information only, and focus on its sufficiency for the task at hand. We compare a rule-based method that utilises the existing latent structure and provides a fusion of different components with holistic and Bayesian Network (BN) approaches. An Active Appearance Model (AAM) based tracker is used to extract relevant information from input sequences. Such information is subsequently used to define the parametric structure of the underlying expression dynamics. We demonstrate that such information can be utilised to animate a synthetic head avatar. Submitte
    • …
    corecore