
    Continuous regression: a functional regression approach to facial landmark tracking

    Facial Landmark Tracking (Face Tracking) is a key step for many Face Analysis systems, such as Face Recognition, Facial Expression Recognition, or Age and Gender Recognition, among others. The goal of Facial Landmark Tracking is to locate a sparse set of points defining a facial shape in a video sequence. These typically include the mouth, the eyes, the contour, or the nose tip. The state-of-the-art method for Face Tracking builds on Cascaded Regression, in which a set of linear regressors is applied in a cascaded fashion, each receiving as input the output of the previous one and successively reducing the error with respect to the target locations. Despite its impressive results, Cascaded Regression suffers from several drawbacks, essentially caused by the theoretical and practical implications of using Linear Regression. In the context of Face Alignment, Linear Regression is used to predict shape displacements from image features through a linear mapping. This mapping is learnt by solving the usual least-squares problem over a set of randomly sampled perturbations. This means that, each time a new regressor is to be trained, Cascaded Regression needs to generate perturbations and repeat the sampling. Moreover, existing solutions are not capable of incorporating incremental learning in real time. It is well known that person-specific models perform better than generic ones, and thus the possibility of personalising generic models while tracking is ongoing is a desirable property, yet to be addressed. This thesis proposes Continuous Regression, a Functional Regression solution to the least-squares problem, resulting in the first real-time incremental face tracker. Briefly speaking, Continuous Regression approximates the sampled features with a first-order Taylor expansion, yielding a closed-form solution over the infinite set of shape displacements. This way, it is possible to model the space of shape displacements as a continuum, without the need for complex bases. Further, this thesis introduces a novel measure that allows Continuous Regression to be extended to spaces of correlated variables. This solution is incorporated into the Cascaded Regression framework, and its computational benefits for training under different configurations are shown. The thesis then presents an approach for incremental learning within Cascaded Regression, and shows that its complexity allows for a real-time implementation. To the best of my knowledge, this is the first incremental face tracker shown to operate in real time. The tracker is tested on an extensive benchmark, attaining state-of-the-art results thanks to its incremental learning capabilities.
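    To make the closed-form idea concrete, here is a minimal numpy sketch of a functional (continuous) least-squares regressor for a single cascade level, assuming zero-mean Gaussian shape perturbations, a numerically estimated feature Jacobian, and no bias term. The function names and the simplified objective are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def continuous_regressor(features, jacobians, Sigma, lam=1e-3):
    """Closed-form linear regressor for one cascade level.

    Instead of sampling perturbations, the expected least-squares cost is
    evaluated analytically under a first-order Taylor expansion of the
    features around each ground-truth shape:
        phi(s* + ds) ~= phi(s*) + J ds,   ds ~ N(0, Sigma).

    features  : list of (d_f,) feature vectors phi_i at the true shapes
    jacobians : list of (d_f, d_s) feature Jacobians J_i at the true shapes
    Sigma     : (d_s, d_s) covariance of the shape perturbations
    Returns R : (d_s, d_f) matrix mapping features of a perturbed shape
                to the predicted perturbation (no bias term, for simplicity).
    """
    d_f = features[0].shape[0]
    A = np.zeros((Sigma.shape[0], d_f))   # expected cross term  E[ds x^T]
    B = lam * np.eye(d_f)                 # expected auto term   E[x x^T] + ridge
    for phi, J in zip(features, jacobians):
        A += Sigma @ J.T
        B += np.outer(phi, phi) + J @ Sigma @ J.T
    return A @ np.linalg.inv(B)

def numerical_jacobian(phi_fn, shape, eps=1e-2):
    """Finite-difference Jacobian of the feature function w.r.t. the shape."""
    phi0 = phi_fn(shape)
    J = np.zeros((phi0.shape[0], shape.shape[0]))
    for k in range(shape.shape[0]):
        s = shape.copy()
        s[k] += eps
        J[:, k] = (phi_fn(s) - phi0) / eps
    return J
```

    At test time a cascade level would apply s ← s − R·phi(s); an incremental extension along the lines described above could keep accumulating the A and B terms from newly tracked frames and re-solve, though that part is not covered by this sketch.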

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, performance has mainly been assessed qualitatively, by visually inspecting the output of a deformable face tracking technique on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures, focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model-free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model-free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.
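    The hybrid strategy (c) can be summarised by a simple control loop: track a face box with a model-free tracker, fall back to the detector when tracking confidence drops, and fit landmarks inside whichever box survives. The sketch below illustrates only that control flow; the component interfaces (detect_face, track_bbox, localise_landmarks) are hypothetical placeholders, not APIs from the paper or any particular library.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical component signatures; any detector / model-free tracker /
# landmark localiser matching them could be plugged in.
DetectFn = Callable[[object], Optional[Tuple]]                 # frame -> bbox or None
TrackFn = Callable[[object, Tuple], Tuple[Tuple, float]]       # frame, bbox -> (bbox, confidence)
LandmarkFn = Callable[[object, Tuple], object]                 # frame, bbox -> landmarks

def hybrid_face_tracking(frames: Sequence, detect_face: DetectFn,
                         track_bbox: TrackFn, localise_landmarks: LandmarkFn,
                         min_confidence: float = 0.5):
    """Yield per-frame landmarks using detection plus model-free tracking."""
    bbox = None
    for frame in frames:
        if bbox is None:
            bbox = detect_face(frame)          # (re-)initialise from the detector
            if bbox is None:
                yield None                     # no face found in this frame
                continue
        else:
            bbox, conf = track_bbox(frame, bbox)
            if conf < min_confidence:          # tracker drifted: fall back to detection
                bbox = detect_face(frame)
                if bbox is None:
                    yield None
                    continue
        yield localise_landmarks(frame, bbox)  # fit landmarks inside the box
```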

    Automatic analysis of facial actions: a survey

    As one of the most comprehensive and objective ways to describe facial expressions, the Facial Action Coding System (FACS) has recently received significant attention. Over the past 30 years, extensive research has been conducted by psychologists and neuroscientists on various aspects of facial expression analysis using FACS. Automating FACS coding would make this research faster and more widely applicable, opening up new avenues to understanding how we communicate through facial expressions. Such an automated process can also potentially increase the reliability, precision and temporal resolution of coding. This paper provides a comprehensive survey of research into machine analysis of facial actions. We systematically review all components of such systems: pre-processing, feature extraction and machine coding of facial actions. In addition, the existing FACS-coded facial expression databases are summarised. Finally, challenges that have to be addressed to make automatic facial action analysis applicable in real-life situations are extensively discussed. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the future of machine recognition of facial actions: what are the challenges and opportunities that researchers in the field face?
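    For readers unfamiliar with the pipeline the survey decomposes (pre-processing, feature extraction, machine coding), here is a deliberately minimal sketch using HOG descriptors and one binary classifier per Action Unit. The choice of HOG plus a linear SVM, and the assumption that faces arrive already aligned and cropped, are illustrative simplifications, not recommendations from the survey.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def extract_features(aligned_faces):
    """Feature extraction: HOG descriptors from pre-processed (aligned,
    cropped, grey-scale, equally sized) face images."""
    return np.array([hog(face, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                     for face in aligned_faces])

def train_au_detectors(aligned_faces, au_labels):
    """Machine coding: one binary classifier per FACS Action Unit.

    au_labels maps an AU id (e.g. 12, lip corner puller) to a 0/1 array
    indicating whether that AU is active in each training image.
    """
    X = extract_features(aligned_faces)
    detectors = {}
    for au, y in au_labels.items():
        clf = LinearSVC(C=1.0)
        clf.fit(X, y)
        detectors[au] = clf
    return detectors

def code_facial_actions(detectors, aligned_faces):
    """Predict the set of active AUs for each new face image."""
    X = extract_features(aligned_faces)
    return [{au for au, clf in detectors.items() if clf.predict(X[i:i + 1])[0] == 1}
            for i in range(len(X))]
```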

    AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time

    Accurate whole-body multi-person pose estimation and tracking is an important yet challenging topic in computer vision. To capture the subtle actions of humans for complex behavior analysis, whole-body pose estimation, including the face, body, hands and feet, is essential, going beyond conventional body-only pose estimation. In this paper, we present AlphaPose, a system that can perform accurate whole-body pose estimation and tracking jointly while running in real time. To this end, we propose several new techniques: Symmetric Integral Keypoint Regression (SIKR) for fast and fine localization, Parametric Pose Non-Maximum-Suppression (P-NMS) for eliminating redundant human detections, and Pose Aware Identity Embedding for joint pose estimation and tracking. During training, we resort to a Part-Guided Proposal Generator (PGPG) and multi-domain knowledge distillation to further improve the accuracy. Our method is able to localize whole-body keypoints accurately and to track humans simultaneously, even given inaccurate bounding boxes and redundant detections. We show a significant improvement over current state-of-the-art methods in both speed and accuracy on COCO-wholebody, COCO, PoseTrack, and our proposed Halpe-FullBody pose estimation dataset. Our model, source code and dataset are made publicly available at https://github.com/MVIG-SJTU/AlphaPose. Comment: Documents for AlphaPose, accepted to TPAMI.
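    The core idea behind pose-level non-maximum suppression is to discard redundant detections of the same person based on how similar their estimated poses are, rather than on box overlap alone. The numpy sketch below shows a greedy version with a hand-crafted similarity; the actual P-NMS criterion and its learned parameters in the paper differ, so treat this purely as an illustration.

```python
import numpy as np

def pose_similarity(kpts_a, conf_a, kpts_b, conf_b, sigma=10.0):
    """Soft keypoint-matching score between two poses.

    kpts_* : (K, 2) keypoint coordinates, conf_* : (K,) keypoint confidences.
    Keypoints that are spatially close and confident in both poses
    contribute most to the score.
    """
    d2 = np.sum((kpts_a - kpts_b) ** 2, axis=1)
    return np.sum(np.exp(-d2 / (2 * sigma ** 2)) * conf_a * conf_b)

def pose_nms(poses, scores, confs, sim_threshold=2.0):
    """Greedy pose NMS: keep the highest-scoring pose, drop poses that are
    too similar to it, then repeat on the remainder.

    poses  : (N, K, 2) keypoints per detection
    scores : (N,) overall detection scores
    confs  : (N, K) per-keypoint confidences
    Returns the indices of the detections that are kept.
    """
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order
                 if pose_similarity(poses[i], confs[i], poses[j], confs[j]) < sim_threshold]
    return keep
```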