3,278 research outputs found

    End-to-end 3D face reconstruction with deep neural networks

    Monocular 3D facial shape reconstruction from a single 2D facial image has been an active research area due to its wide applications. Inspired by the success of deep neural networks (DNN), we propose a DNN-based approach for End-to-End 3D FAce Reconstruction (UH-E2FAR) from a single 2D image. Different from recent works that reconstruct and refine the 3D face in an iterative manner using both an RGB image and an initial 3D facial shape rendering, our DNN model is end-to-end, and thus the complicated 3D rendering process can be avoided. Moreover, we integrate two components into the DNN architecture, namely a multi-task loss function and a fusion convolutional neural network (CNN), to improve facial expression reconstruction. With the multi-task loss function, 3D face reconstruction is divided into neutral 3D facial shape reconstruction and expressive 3D facial shape reconstruction. The neutral 3D facial shape is class-specific, so higher-layer features are useful. In comparison, the expressive 3D facial shape favors lower- or intermediate-layer features. With the fusion CNN, features from different intermediate layers are fused and transformed to predict the 3D expressive facial shape. Through extensive experiments, we demonstrate the superiority of our end-to-end framework in improving the accuracy of 3D face reconstruction. Comment: Accepted to CVPR1
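    The split into a neutral branch driven by the deepest features and an expressive branch driven by fused intermediate features, trained with a two-term loss, can be illustrated with a small sketch. The PyTorch-style code below is only a minimal illustration of that idea, not the authors' UH-E2FAR architecture; all layer sizes, names and the loss weight are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchFaceNet(nn.Module):
    """Illustrative sketch: neutral shape from deep (class-specific) features,
    expressive shape from fused intermediate features (hypothetical sizes)."""
    def __init__(self, n_vertices=1000):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.block3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        # neutral branch uses only the deepest features
        self.neutral_head = nn.Linear(128, 3 * n_vertices)
        # fusion branch combines pooled features from all intermediate layers
        self.fusion_head = nn.Linear(32 + 64 + 128, 3 * n_vertices)

    def forward(self, x):
        f1 = self.block1(x)
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        p1, p2, p3 = (self.pool(f).flatten(1) for f in (f1, f2, f3))
        neutral = self.neutral_head(p3)
        expressive = neutral + self.fusion_head(torch.cat([p1, p2, p3], dim=1))
        return neutral, expressive

def multitask_loss(pred_neutral, pred_expr, gt_neutral, gt_expr, w=0.5):
    # weighted sum of the neutral and expressive shape reconstruction errors
    return w * nn.functional.mse_loss(pred_neutral, gt_neutral) + \
           (1 - w) * nn.functional.mse_loss(pred_expr, gt_expr)
```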

    Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories

    In this paper, we propose a new approach for facial expression recognition using deep covariance descriptors. The solution is based on the idea of encoding local and global Deep Convolutional Neural Network (DCNN) features extracted from still images into compact local and global covariance descriptors. These covariance matrices lie on the manifold of Symmetric Positive Definite (SPD) matrices. By classifying static facial expressions with a Support Vector Machine (SVM) using a valid Gaussian kernel on the SPD manifold, we show that deep covariance descriptors are more effective than standard classification with fully connected layers and softmax. In addition, we propose a new solution that models the temporal dynamics of facial expressions as deep trajectories on the SPD manifold. As an extension of the classification pipeline for covariance descriptors, we apply SVM with valid positive definite kernels derived from global alignment to classify deep covariance trajectories. Extensive experiments on the Oulu-CASIA, CK+, and SFEW datasets show that both the proposed static and dynamic approaches achieve state-of-the-art performance for facial expression recognition, outperforming many recent approaches. Comment: A preliminary version of this work appeared in "Otberdout N, Kacem A, Daoudi M, Ballihi L, Berretti S. Deep Covariance Descriptors for Facial Expression Recognition, in British Machine Vision Conference 2018, BMVC 2018, Northumbria University, Newcastle, UK, September 3-6, 2018; 2018:159." arXiv admin note: substantial text overlap with arXiv:1805.0386
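    The core step, turning a set of local DCNN features into an SPD covariance descriptor and comparing descriptors with a positive-definite Gaussian kernel, can be sketched as follows. This is a schematic illustration using the log-Euclidean distance (which is known to yield a valid kernel on the SPD manifold); the function names, the regularization and the gamma value are assumptions, and the paper's exact kernel construction may differ.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """Covariance of local DCNN features (rows = spatial locations,
    columns = channels); eps regularization keeps the matrix SPD."""
    f = features - features.mean(axis=0, keepdims=True)
    cov = f.T @ f / (f.shape[0] - 1)
    return cov + eps * np.eye(cov.shape[0])

def log_spd(m):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T

def log_euclidean_gaussian_kernel(a, b, gamma=1e-3):
    """Gaussian kernel built on the log-Euclidean distance between SPD matrices."""
    d = np.linalg.norm(log_spd(a) - log_spd(b), 'fro')
    return np.exp(-gamma * d ** 2)

# toy usage: descriptors from two random "feature maps" of 49 locations x 16 channels
rng = np.random.default_rng(0)
c1 = covariance_descriptor(rng.normal(size=(49, 16)))
c2 = covariance_descriptor(rng.normal(size=(49, 16)))
print(log_euclidean_gaussian_kernel(c1, c2))
```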

    Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

    Cinemagraphs are a compelling way to convey dynamic aspects of a scene. In these media, dynamic and still elements are juxtaposed to create an artistic and narrative experience. Creating a high-quality, aesthetically pleasing cinemagraph requires isolating objects in a semantically meaningful way and then selecting good start times and looping periods for those objects to minimize visual artifacts (such as tearing). To achieve this, we present a new technique that uses object recognition and semantic segmentation as part of an optimization method to automatically create cinemagraphs from videos that are both visually appealing and semantically meaningful. Given a scene with multiple objects, there are many cinemagraphs one could create. Our method evaluates these multiple candidates and presents the best one, as determined by a model trained to predict human preferences in a collaborative way. We demonstrate the effectiveness of our approach with multiple results and a user study. Comment: To appear in ICCV 2017. Total 17 pages including the supplementary material
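    The looping-period selection mentioned above can be sketched as a brute-force search that scores each candidate (start, period) by the discontinuity at the loop seam inside a segmented region. This is only a toy illustration of the seam-cost idea, not the paper's optimization, which additionally relies on object recognition, semantic segmentation and a learned preference model; the names and the cost below are assumptions.

```python
import numpy as np

def best_loop(frames, mask, min_period=10):
    """Choose the (start, period) whose loop seam has the smallest mean
    squared pixel difference inside the animated region.
    frames: (T, H, W, 3) array; mask: (H, W) boolean array."""
    T = frames.shape[0]
    region = frames[:, mask].astype(np.float64)  # (T, n_pixels, 3)
    best_start, best_period, best_cost = 0, min_period, np.inf
    for start in range(T - min_period):
        for period in range(min_period, T - start + 1):
            # cost of jumping from the loop's last frame back to its first
            seam = np.mean((region[start + period - 1] - region[start]) ** 2)
            if seam < best_cost:
                best_start, best_period, best_cost = start, period, seam
    return best_start, best_period
```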

    Parameterization of point-cloud freeform surfaces using adaptive sequential learning RBF networks

    We propose a self-organizing Radial Basis Function (RBF) neural network method for the parameterization of freeform surfaces from large, noisy and unoriented point clouds. In particular, an adaptive sequential learning algorithm is presented for network construction from a single instance of the point set. The adaptive learning allows neurons to be dynamically inserted and fully adjusted (e.g., their locations, widths and weights) according to the mapping residuals and the novelty of data points with respect to the underlying geometry. Pseudo-neurons exhibiting very limited contributions can be removed through a pruning procedure. Additionally, a neighborhood extended Kalman filter (NEKF) is developed to significantly accelerate parameterization. Experimental results show that this adaptive learning enables effective capture of global low-frequency variations while preserving sharp local details, ultimately leading to accurate and compact parameterization, as characterized by a small number of neurons. Parameterization using the proposed RBF network provides simple, low-cost and low-storage solutions to many problems such as surface construction, re-sampling, hole filling, multiple level-of-detail meshing and data compression from unstructured and incomplete range data. Performance results are also presented for comparison.
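    A minimal sketch of the insert/adjust/prune loop described above is given below. All thresholds, the fixed width and the simple weight-update rule are assumptions for illustration; in particular, the paper uses a neighborhood extended Kalman filter rather than the gradient-style update shown here.

```python
import numpy as np

class AdaptiveRBF:
    """Toy sequential RBF learning: insert a neuron when a point is both novel
    (far from existing centres) and poorly predicted, otherwise nudge the
    nearest neuron's weight; prune neurons with negligible contribution."""
    def __init__(self, novelty=0.2, err_tol=0.05, width=0.15, lr=0.1):
        self.centres, self.weights = [], []
        self.novelty, self.err_tol, self.width, self.lr = novelty, err_tol, width, lr

    def predict(self, x):
        if not self.centres:
            return 0.0
        c = np.array(self.centres)
        phi = np.exp(-np.sum((c - x) ** 2, axis=1) / (2 * self.width ** 2))
        return float(np.dot(self.weights, phi))

    def fit_point(self, x, y):
        err = y - self.predict(x)
        dists = [np.linalg.norm(x - c) for c in self.centres] or [np.inf]
        if min(dists) > self.novelty and abs(err) > self.err_tol:
            self.centres.append(np.array(x, dtype=float))   # insert a new neuron
            self.weights.append(err)
        elif self.centres:
            i = int(np.argmin(dists))                        # adjust the nearest neuron
            self.weights[i] += self.lr * err

    def prune(self, min_weight=1e-3):
        keep = [i for i, w in enumerate(self.weights) if abs(w) > min_weight]
        self.centres = [self.centres[i] for i in keep]
        self.weights = [self.weights[i] for i in keep]
```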

    Graph Embedded Extreme Learning Machine


    Multi-View Face Recognition From Single RGBD Models of the Faces

    This work takes important steps towards solving the following problem of current interest: assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated above is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) generating a large set of viewpoint-dependent face images from a single RGBD frontal image for each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face as recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets which include a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to successfully recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks.
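    The evidence-integration step in (3) might be sketched as follows: each viewpoint-specific classifier votes for its top-scoring identity, weighted by how decisively it wins. The margin-based weight and the toy names are assumptions for illustration; the paper's actual weighting scheme may differ.

```python
def weighted_vote(per_image_scores):
    """Fuse evidence from several probe images of one face.
    per_image_scores: list of {identity: similarity score} dicts, one per
    viewpoint. Each image votes for its best identity, weighted by the
    margin between its top two scores."""
    tally = {}
    for scores in per_image_scores:
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_id, best_score = ranked[0]
        margin = best_score - ranked[1][1] if len(ranked) > 1 else best_score
        tally[best_id] = tally.get(best_id, 0.0) + margin
    return max(tally, key=tally.get)

# toy usage with three viewpoint-specific score sets
print(weighted_vote([{'alice': 0.9, 'bob': 0.4},
                     {'alice': 0.6, 'bob': 0.7},
                     {'alice': 0.8, 'bob': 0.3}]))
```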

    Adaptive 3D facial action intensity estimation and emotion recognition

    Automatic recognition of facial emotion has been widely studied for various computer vision tasks (e.g. health monitoring, driver state surveillance and personalized learning). Most existing facial emotion recognition systems, however, either have not fully considered subject-independent dynamic features or were limited to 2D models, and thus are not robust enough for real-life recognition tasks with subject variation, head movement and illumination change. Moreover, there is also a lack of systematic research on the effective detection of newly arrived novel emotion classes. To address these challenges, we present a real-time 3D facial Action Unit (AU) intensity estimation and emotion recognition system. It automatically selects 16 motion-based facial feature sets using minimal-redundancy–maximal-relevance criterion-based optimization and estimates the intensities of 16 diagnostic AUs using feedforward Neural Networks and Support Vector Regressors. We also propose a set of six novel adaptive ensemble classifiers for robust classification of the six basic emotions and for the detection of newly arrived unseen novel emotion classes (emotions that are not included in the training set). Distance-based clustering and uncertainty measures of the base classifiers within each ensemble model are used to inform the novel class detection. Evaluated on the Bosphorus 3D database, the system achieves the best performance of 0.071 overall Mean Squared Error (MSE) for AU intensity estimation using Support Vector Regressors, and 92.2% average accuracy for the recognition of the six basic emotions using the proposed ensemble classifiers. In comparison with related work, our approach outperforms other state-of-the-art research on 3D facial emotion recognition for the Bosphorus database. Moreover, in online real-time evaluation with real human subjects, the proposed system also shows superior real-time performance with 84% recognition accuracy and strong flexibility in adapting to newly arrived novel emotions (e.g. ‘contempt’, which is not included in the six basic emotions).
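    The novel-class detection idea, combining distance to known-class clusters with disagreement among the ensemble's base classifiers, might be sketched as follows. The thresholds and the simple agreement measure are assumptions for illustration, not the paper's exact clustering or uncertainty measures.

```python
import numpy as np

def detect_novel(sample, centroids, base_predictions,
                 dist_thresh=2.0, agree_thresh=0.5):
    """Flag a feature vector as belonging to a novel emotion class when it is
    far from every known-class centroid AND the base classifiers disagree.
    centroids: {label: centroid vector}; base_predictions: labels predicted
    by the ensemble members for this sample."""
    nearest = min(np.linalg.norm(sample - c) for c in centroids.values())
    votes = {p: base_predictions.count(p) for p in set(base_predictions)}
    agreement = max(votes.values()) / len(base_predictions)
    return nearest > dist_thresh and agreement < agree_thresh

# toy usage with two known classes and a disagreeing ensemble
centroids = {'happy': np.zeros(4), 'sad': np.ones(4)}
print(detect_novel(np.full(4, 5.0), centroids, ['happy', 'sad', 'happy', 'sad']))
```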