December 2024
School of Engineering

Facial behavior analysis and recognition play an important role in human-centered AI, driving technological progress in human emotion recognition, attention detection, and autonomous driving. This research performs 3D facial analysis, focusing on accurate 3D face and eye reconstruction, facial action recognition, and eye gaze estimation.

The development of deep learning models, combined with large benchmark datasets and representative 3D facial models, has greatly improved the accuracy of 3D face and eye reconstruction. Despite this progress, existing methods in both areas still suffer from several significant limitations: a) lack of detailed shape modeling for accurately recovering subtle 3D facial motions and eyeball movement; b) over-dependence on large amounts of training data and labels; c) poor generalization across subjects and under different illuminations, distances, and large head poses; and d) failure to effectively exploit physically plausible facial dynamics in videos. We introduce methods to address these limitations.

For accurate 3D face reconstruction, we combine 3D facial models with the Facial Action Unit (FAU) coding system, in which each AU represents a specific local facial motion driven by the activation of specific muscles. We first present a personalized 3D FAU blendshape learning framework, together with a 3D face reconstruction model, for recovering AU-interpretable 3D facial details. We also incorporate general knowledge of AU correlations into the learning process to reduce the number of expression labels required in training. Our method not only produces a more personalized and detailed 3D face model but also yields improved facial action recognition.

For 3D eye reconstruction, we create a deformable eye shape basis for representing detailed 3D eye structure. Unlike existing approaches, we incorporate the 3D eye shape basis into a learning-based eye gaze estimation framework, which provides geometry-based weak supervision when training the deep model. Our model surpasses others in simultaneously recovering 3D eye shape, eye rotation, and gaze from a single image, and it is less dependent on full training labels while still maintaining gaze accuracy.

To further improve generalization and to exploit facial dynamics for both facial actions and eye movement, we propose dynamic 3D face action and eye gaze tracking methods for monocular videos. The intuition comes from facial anatomy: every facial motion component is activated by specific muscle contractions, so the reconstructed 3D motion should obey the physical laws of motion (Newton's second law). Unlike our frame-based methods, we design separate physically plausible models for facial action units and for eyeball movement. For facial action units, we design a physics-informed model that constrains the reconstructed sequence to satisfy the underlying physical laws. For dynamic gaze tracking, we propose a physics-informed gaze tracking system that subjects eyeball movements to physical constraints and biomechanical laws. Furthermore, we propose to leverage human interactions and hand-eye coordination to reduce the need for 3D eye gaze annotation via weakly supervised eye gaze tracking models. Brief illustrative sketches of these formulations are given below.
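To make the AU-interpretable blendshape formulation concrete, the following is a minimal Python/NumPy sketch. The mesh resolution, AU count, and function names are illustrative assumptions, not the thesis implementation: a person-specific neutral mesh is deformed by a weighted sum of learned per-AU blendshape offsets, so each coefficient carries a direct Facial Action Unit meaning.

    import numpy as np

    N_VERTS, N_AUS = 5023, 16   # hypothetical mesh resolution and AU count

    neutral = np.zeros((N_VERTS, 3))                 # person-specific neutral shape
    au_blendshapes = np.zeros((N_AUS, N_VERTS, 3))   # learned per-subject AU offsets

    def reconstruct_face(au_intensities):
        """Linear AU blendshape model: V = V_neutral + sum_k w_k * B_k."""
        w = np.clip(au_intensities, 0.0, 1.0)        # AU intensities in [0, 1]
        return neutral + np.tensordot(w, au_blendshapes, axes=1)

    face = reconstruct_face(np.random.rand(N_AUS))   # (N_VERTS, 3) vertex array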
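The geometry-based weak supervision for 3D eye reconstruction can likewise be sketched as a two-term loss (PyTorch; all tensor names are assumptions): a landmark reprojection term supervises the predicted eye geometry on every image, while the gaze-direction term is applied only on the subset of images that carry gaze labels.

    import torch
    import torch.nn.functional as F

    def weak_gaze_loss(proj_landmarks, det_landmarks, pred_gaze, gaze_label=None,
                       w_geo=1.0, w_gaze=1.0):
        """Geometry term on every image; gaze term only where labels exist."""
        # projected 3D eye landmarks should match the detected 2D landmarks
        loss = w_geo * F.mse_loss(proj_landmarks, det_landmarks)
        if gaze_label is not None:
            # angular error between predicted and labeled 3D gaze directions
            cos = F.cosine_similarity(pred_gaze, gaze_label, dim=-1)
            loss = loss + w_gaze * (1.0 - cos).mean()
        return loss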
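One plausible reading of the physics-informed constraint, sketched below under an assumed discretization and assumed variable names, treats the per-frame motion parameters (AU intensities or eyeball rotation angles) as a trajectory and penalizes the residual of a discretized Newton's second law, with the driving force predicted alongside the motion.

    import torch

    def newton_residual(x, force, mass=1.0, dt=1.0 / 30.0):
        """x: (T, D) motion trajectory; force: (T, D) predicted driving forces.
        Penalizes || m * x'' - f || so the sequence obeys Newton's second law."""
        accel = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dt ** 2   # central differences
        return ((mass * accel - force[1:-1]) ** 2).mean()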
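Finally, the hand-eye coordination idea can be sketched as pseudo-labeling (assumed variable names): during manipulation, gaze tends to fixate the manipulated point, so the direction from the 3D eye center to the 3D hand or object point can serve as a weak gaze target when no annotation is available.

    import torch
    import torch.nn.functional as F

    def hand_eye_pseudo_gaze(eye_center, hand_point):
        """Weak gaze target: unit vector from the 3D eye center to the
        3D hand/object interaction point (both (..., 3) tensors)."""
        return F.normalize(hand_point - eye_center, dim=-1)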
Our methods are evaluated against state-of-the-art approaches both quantitatively and qualitatively, covering 3D face reconstruction accuracy, facial action unit detection accuracy, and gaze estimation accuracy, both within and across datasets.