
    Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression

    We present techniques for improving performance-driven facial animation, emotion recognition, and facial key-point or landmark prediction using learned identity-invariant representations. Established approaches to these problems can work well if sufficient examples and labels for a particular identity are available and factors of variation are highly controlled. However, labeled examples of facial expressions, emotions and key-points for new individuals are difficult and costly to obtain. In this paper we improve the ability of techniques to generalize to new and unseen individuals by explicitly modeling previously seen variations related to identity and expression. We use a weakly-supervised approach in which identity labels are used to learn the factors of variation linked to identity separately from those related to expression. We show how probabilistic modeling of these sources of variation allows one to learn identity-invariant representations for expressions, which can then be used to identity-normalize various procedures for facial expression analysis and animation control. We also show how to extend the widely used active appearance models and constrained local models by replacing the underlying point distribution models, which are typically constructed using principal component analysis, with identity-expression factorized representations. We present a wide variety of experiments in which we consistently improve performance on emotion recognition, markerless performance-driven facial animation and facial key-point tracking. (To appear in the Image and Vision Computing journal, IMAVIS.)
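    As a rough illustration of the identity-expression factorization described above, the sketch below shows a linear point distribution model split into separate identity and expression bases, plus the identity-normalization step that keeps the expression component while swapping in a reference identity. The bases here are random placeholders and all names are illustrative; this is a minimal sketch of the general idea, not the paper's probabilistic model.

```python
# Minimal sketch of an identity-expression factorized point
# distribution model. Assumption: separate linear bases B_id and
# B_expr have already been learned from identity-grouped data.
import numpy as np

rng = np.random.default_rng(0)

n_points = 68                          # landmarks per face
d = 2 * n_points                       # flattened (x, y) shape vector
k_id, k_expr = 10, 5                   # latent dims (illustrative)

mean_shape = rng.normal(size=d)
B_id = rng.normal(size=(d, k_id))      # identity basis (placeholder)
B_expr = rng.normal(size=(d, k_expr))  # expression basis (placeholder)

def reconstruct(c_id, c_expr):
    """Factorized PDM: shape = mean + identity part + expression part."""
    return mean_shape + B_id @ c_id + B_expr @ c_expr

def encode(shape):
    """Least-squares projection onto the joint basis."""
    B = np.hstack([B_id, B_expr])
    c, *_ = np.linalg.lstsq(B, shape - mean_shape, rcond=None)
    return c[:k_id], c[k_id:]

def identity_normalize(shape, reference_id_code):
    """Swap the identity component for a reference identity while
    keeping the expression component: the 'identity-normalized'
    shape used for expression analysis."""
    _, c_expr = encode(shape)
    return reconstruct(reference_id_code, c_expr)

# Example: normalize a random face to the mean identity (zero code).
probe = reconstruct(rng.normal(size=k_id), rng.normal(size=k_expr))
neutral = identity_normalize(probe, np.zeros(k_id))
```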

    Relating Objective and Subjective Performance Measures for AAM-based Visual Speech Synthesizers

    We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that takes acoustic features as input, and one that takes a phonetic transcription as input. Both synthesizers are trained on the same data and evaluated using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect, we find that the subjective score for the entire sequence is lower than that of sequences generated by our synthesizers. This observation motivates further consideration of an often ignored issue: to what extent are subjective measures of performance correlated with objective ones? Significantly, we find that the most commonly used objective measures are not necessarily the best indicators of viewer-perceived quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality.
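    The DTW-based measure the paper favours is straightforward to compute. Below is a minimal sketch, assuming trajectories of AAM parameter vectors stored as NumPy arrays; function and variable names are illustrative, not taken from the authors' code.

```python
# Classic dynamic time warp cost between a synthesized and a
# ground-truth sequence of AAM parameter vectors.
import numpy as np

def dtw_cost(synth, truth):
    """synth: (T1, p) array, truth: (T2, p) array.
    Returns the accumulated cost of the optimal alignment path."""
    t1, t2 = len(synth), len(truth)
    # Pairwise Euclidean distances between frames.
    dist = np.linalg.norm(synth[:, None, :] - truth[None, :, :], axis=-1)
    acc = np.full((t1 + 1, t2 + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j],      # insertion
                acc[i, j - 1],      # deletion
                acc[i - 1, j - 1],  # match
            )
    return acc[t1, t2]

# Example: compare a time-stretched copy of a trajectory to itself.
truth = np.cumsum(np.random.default_rng(1).normal(size=(50, 8)), axis=0)
synth = truth[::2].repeat(2, axis=0)[:50]
print(dtw_cost(synth, truth))
```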

    Modelling Face Memory Reveals Task-generalizable Representations

    Current cognitive theories are cast in terms of information-processing mechanisms that use mental representations. For example, people use their mental representations to identify familiar faces under various conditions of pose, illumination and ageing, or to draw resemblance between family members. Yet the actual information contents of these representations are rarely characterized, which hinders knowledge of the mechanisms that use them. Here, we modelled the three-dimensional representational contents of four faces that were familiar to 14 participants as work colleagues. The representational contents were created by reverse-correlating the identity information generated on each trial with judgements of the face's similarity to the individual participant's memory of that face. In a second study, testing new participants, we demonstrated the validity of the modelled contents using everyday face tasks that generalize identity judgements to new viewpoints, age and sex. Our work highlights that such models of mental representations are critical to understanding generalization behaviour and its underlying information-processing mechanisms.
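    A minimal sketch of the reverse-correlation idea, under our own simplifying assumption (not the paper's) that judgements reduce to scalar similarity ratings: the remembered face is estimated as the rating-weighted average of random identity perturbations.

```python
# Reverse correlation ("classification image" style), simulated.
import numpy as np

rng = np.random.default_rng(2)

n_trials, n_dims = 2000, 300           # trials, face-component dims
base_face = np.zeros(n_dims)
true_memory = rng.normal(size=n_dims)  # stand-in for the remembered face

perturbations = rng.normal(size=(n_trials, n_dims))
stimuli = base_face + perturbations

# Simulated similarity judgements: higher when the stimulus points
# toward the remembered face (plus response noise).
ratings = stimuli @ true_memory + rng.normal(scale=5.0, size=n_trials)

# Reverse correlation: average the perturbations weighted by the
# mean-centred ratings.
weights = ratings - ratings.mean()
estimate = weights @ perturbations / n_trials

# The estimate should correlate with the remembered face.
r = np.corrcoef(estimate, true_memory)[0, 1]
print(f"correlation with true memory: {r:.2f}")
```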

    Robust Face Alignment for Illumination and Pose Invariant Face Recognition

    In building a face recognition system for real-life scenarios, one usually faces the problem of selecting a feature space and preprocessing methods, such as alignment, under varying illumination conditions and poses. In this study, we developed a robust face alignment approach based on the Active Appearance Model (AAM) by inserting an illumination normalization module into the standard AAM search procedure and inserting different poses of the same identity into the training set. The modified AAM search can handle both illumination and pose variations in the same epoch, and hence provides better convergence in both the point-to-point and point-to-curve senses. We also investigate how face recognition performance is affected by the choice of feature space as well as by the proposed alignment method. The experimental results show that the combined pose alignment and illumination normalization methods increase the recognition rates considerably for all feature spaces.
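    In its simplest form, an illumination normalization module of this kind is a photometric normalization applied to each texture sample before the AAM error image is computed. The sketch below is generic and under our own assumptions, not the authors' implementation; the fitting loop itself is elided.

```python
# Photometric normalization inserted into an AAM texture comparison.
import numpy as np

def normalize_illumination(texture):
    """Zero-mean, unit-norm normalization of the shape-free texture
    vector sampled during AAM search."""
    t = texture - texture.mean()
    norm = np.linalg.norm(t)
    return t / norm if norm > 0 else t

def aam_residual(sampled_texture, model_texture):
    """Error image used to drive the AAM parameter update, computed
    on illumination-normalized textures so the search is insensitive
    to global lighting changes."""
    return (normalize_illumination(sampled_texture)
            - normalize_illumination(model_texture))
```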

    Modeling small objects under uncertainties : novel algorithms and applications.

    Active Shape Models (ASM), Active Appearance Models (AAM) and Active Tensor Models (ATM) are common approaches to modeling elastic (deformable) objects. These models require an ensemble of shapes and textures, annotated by human experts, in order to identify the model order and parameters. A candidate object may then be represented by a weighted sum of basis functions generated by an optimization process. These methods have been very effective for modeling deformable objects in biomedical imaging, biometrics, computer vision and graphics. They have been tried mainly on objects with known features that are amenable to manual (expert) annotation, and have not been examined on objects with ambiguities too severe for experts to characterize uniquely. This dissertation presents a unified approach for modeling, detecting, segmenting and categorizing small objects under uncertainty, with a focus on lung nodules that may appear in low-dose CT (LDCT) scans of the human chest. The AAM, ASM and ATM approaches are used for the first time on this application. A new formulation of object detection by template matching, cast as an energy optimization, is introduced. Nine similarity measures for matching have been quantitatively evaluated for detecting nodules less than 1 cm in diameter. Statistical methods that combine intensity, shape and spatial interaction are examined for the segmentation of small objects. Extensions of the intensity model using the linear combination of Gaussians (LCG) approach are introduced in order to estimate the number of modes in the LCG equation. The classical maximum a posteriori (MAP) segmentation approach has been adapted to handle the segmentation of small lung nodules that are randomly located in the lung tissue. A novel empirical approach has been devised to simultaneously detect and segment the lung nodules in LDCT scans. The level set method was also applied to lung nodule segmentation, with a new formulation of the energy function controlling the level set propagation that takes into account the specific properties of the nodules. Finally, a novel approach for classifying the segmented nodules into categories has been introduced. Geometric object descriptors such as SIFT, ASIFT, SURF and LBP have been used for feature extraction and matching of small lung nodules; the LBP has been found to be the most robust. Categorization implies classification of detected and segmented objects into classes or types. The object descriptors have been deployed in the detection step for false-positive reduction, and in the categorization stage to assign a class and type to the nodules. The AAM/ASM/ATM models have been used for the categorization stage. The front-end processes of lung nodule modeling, detection, segmentation and classification/categorization are model-based and data-driven. This dissertation is the first attempt in the literature to create an entirely model-based approach to lung nodule analysis.
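    As one concrete piece of the pipeline sketched above, the snippet below shows uniform-LBP histogram extraction and a chi-square comparison for candidate nodule patches, since LBP is reported as the most robust descriptor. The parameters and function names are illustrative assumptions, not the dissertation's code.

```python
# Uniform LBP histograms for matching candidate nodule patches.
import numpy as np
from skimage.feature import local_binary_pattern

P, R = 8, 1  # neighbours and radius for the LBP operator

def lbp_histogram(patch):
    """Uniform LBP code histogram for a 2D grayscale patch."""
    codes = local_binary_pattern(patch, P, R, method="uniform")
    n_bins = P + 2  # uniform patterns plus one non-uniform bin
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins),
                           density=True)
    return hist

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two LBP histograms; small values
    mean the candidate patch resembles the nodule template."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```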