    DAA: A Delta Age AdaIN operation for age estimation via binary code transformer

    Naked-eye age recognition is usually based on comparison with the ages of others. However, computer vision methods ignore this idea because it is difficult to obtain representative contrast images for each age. Inspired by transfer learning, we design the Delta Age AdaIN (DAA) operation to obtain the feature difference from each comparison age; it produces a style map for each age from two learned values representing the mean and standard deviation. We use the binary code of the age, a natural number, as the input to the transfer in order to obtain continuous age feature information. The two groups of values learned in the Binary code mapping correspond to the means and standard deviations of the comparison ages. In summary, our method consists of four modules: FaceEncoder, the DAA operation, Binary code mapping, and AgeDecoder. After obtaining the delta ages via the AgeDecoder, we take the average of all comparison ages plus their delta ages as the predicted age. Compared with state-of-the-art methods, our method achieves better performance with fewer parameters on multiple facial age datasets.
    Comment: Accepted by CVPR 2023; 8 pages, 3 figures
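    The core of the abstract above is an AdaIN-style re-normalization: a face feature is whitened and then re-styled with a learned mean and standard deviation associated with each comparison age, whose binary code drives the mapping. The sketch below illustrates only that mechanism; the function names, the 8-bit code width, and the scalar (mu, sigma) are illustrative assumptions, not the paper's actual implementation.

    ```python
    import numpy as np

    def age_to_binary(age, bits=8):
        # Binary code of the age natural number, e.g. 5 -> [0, 1, 0, 1] with bits=4.
        # (The paper feeds such codes to a learned Binary code mapping.)
        return np.array([(age >> i) & 1 for i in reversed(range(bits))], dtype=np.float32)

    def delta_age_adain(feat, mu, sigma, eps=1e-5):
        # AdaIN-style transfer: whiten the face feature, then re-style it with the
        # comparison age's mean (mu) and standard deviation (sigma). In the paper
        # mu and sigma are produced by the Binary code mapping; here they are
        # passed in directly as placeholder scalars.
        normalized = (feat - feat.mean()) / (feat.std() + eps)
        return sigma * normalized + mu
    ```

    After this operation the output feature has (approximately) the target age's statistics; the AgeDecoder would then regress a delta age from the difference between styled and original features.
    
    
    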

    Beyond PCA: Deep Learning Approaches for Face Modeling and Aging

    Modeling faces with large variations has been a challenging task in computer vision. These variations, such as expressions, poses, and occlusions, are usually complex and non-linear. Moreover, new facial images come with their own, widely diverse, characteristic artifacts. Therefore, a good face modeling approach needs to be carefully designed to adapt flexibly to these challenging issues. Recently, the Deep Learning approach has gained significant attention as one of the emerging research topics in both higher-level representation of data and the distribution of observations. Thanks to the non-linear structure of deep learning models and the strength of latent variables organized in hidden layers, it can efficiently capture variations and structures in complex data. Inspired by this motivation, we present two novel approaches, i.e. Deep Appearance Models (DAM) and Robust Deep Appearance Models (RDAM), to accurately capture both the shape and texture of face images under large variations. In DAM, three crucial components represented in hierarchical layers are modeled using Deep Boltzmann Machines (DBM) to robustly capture the variations of facial shapes and appearances. DAM has shown its potential in inferring a representation for new face images under various challenging conditions. An improved version of DAM, named Robust DAM (RDAM), is also introduced to better handle occluded face areas and therefore produce more plausible reconstruction results. These proposed approaches are evaluated in various applications to demonstrate their robustness and capabilities, e.g. facial super-resolution reconstruction, facial off-angle reconstruction, facial occlusion removal, and age estimation, using challenging face databases: Labeled Face Parts in the Wild (LFPW), Helen, and FG-NET. 
Compared to classical and other deep-learning-based approaches, the proposed DAM and RDAM achieve competitive results in these applications, demonstrating their advantages in handling occlusions, facial representation, and reconstruction. In addition to DAM and RDAM, which are mainly used for modeling a single facial image, the second part of the thesis focuses on novel deep models, i.e. Temporal Restricted Boltzmann Machines (TRBM) and tractable Temporal Non-volume Preserving (TNVP) approaches, to further model face sequences. By exploiting the additional temporal relationships present in sequence data, the proposed models have an advantage in predicting the future of a sequence from its past. In the applications of face age progression, age regression, and age-invariant face recognition, these models have shown their potential not only in efficiently capturing the non-linear, age-related variance but also in producing a smooth synthesis in age progression across faces. Moreover, the structure of TNVP can be transformed into a deep convolutional network while keeping the advantages of probabilistic models with tractable log-likelihood density estimation. The proposed approach is evaluated in terms of synthesizing age-progressed faces and cross-age face verification. It consistently shows state-of-the-art results on various face aging databases, i.e. FG-NET, MORPH, our collected large-scale aging database named AginG Faces in the Wild (AGFW), and the Cross-Age Celebrity Dataset (CACD). A large-scale face verification on the MegaFace challenge 1 is also performed to further show the advantages of our proposed approach.

    A Double-Deep Spatio-Angular Learning Framework for Light Field based Face Recognition

    Face recognition has attracted increasing attention due to its wide range of applications, but it is still challenging when facing large variations in the biometric data characteristics. Lenslet light field cameras have recently come into prominence for capturing rich spatio-angular information, thus offering new possibilities for advanced biometric recognition systems. This paper proposes a double-deep spatio-angular learning framework for light field based face recognition, which is able to learn both texture and angular dynamics in sequence using convolutional representations; to the best of our knowledge, this recognition framework has not previously been proposed for face recognition or any other visual recognition task. The proposed double-deep learning framework includes a long short-term memory (LSTM) recurrent network whose inputs are VGG-Face descriptions computed using a VGG-Very-Deep-16 convolutional neural network (CNN). The VGG-16 network uses different face viewpoints rendered from a full light field image, organised as a pseudo-video sequence. A comprehensive set of experiments has been conducted with the IST-EURECOM light field face database, for varied and challenging recognition tasks. Results show that the proposed framework achieves superior face recognition performance compared to the state-of-the-art.
    Comment: Submitted to IEEE Transactions on Circuits and Systems for Video Technology
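    The pipeline described above, per-viewpoint CNN descriptors fed as a pseudo-video into an LSTM, can be sketched shape-wise as follows. This is a minimal numpy illustration under stated assumptions: the random vectors stand in for VGG-Face descriptors, the sizes (4096-d descriptor, 256-d hidden state, 9 viewpoints) are placeholders, and the hand-rolled LSTM cell replaces whatever trained recurrent network the paper uses.

    ```python
    import numpy as np

    def lstm_step(x, h, c, W, U, b):
        # One LSTM step; z stacks the four gate pre-activations
        # [input, forget, cell, output], each of hidden size H.
        H = h.shape[0]
        z = W @ x + U @ h + b
        i = 1 / (1 + np.exp(-z[:H]))          # input gate
        f = 1 / (1 + np.exp(-z[H:2*H]))       # forget gate
        g = np.tanh(z[2*H:3*H])               # candidate cell state
        o = 1 / (1 + np.exp(-z[3*H:]))        # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c

    # Pseudo-video: one descriptor per face viewpoint rendered from the light field.
    rng = np.random.default_rng(0)
    D, H, V = 4096, 256, 9                    # descriptor dim, hidden dim, viewpoints (assumed)
    W = rng.normal(0, 0.01, (4 * H, D))
    U = rng.normal(0, 0.01, (4 * H, H))
    b = np.zeros(4 * H)
    h, c = np.zeros(H), np.zeros(H)
    for t in range(V):
        x = rng.normal(size=D)                # stand-in for a VGG-Face description
        h, c = lstm_step(x, h, c, W, U, b)
    # h now summarises the angular dynamics across viewpoints; in the full
    # framework an identity classifier would sit on top of this state.
    ```

    The design choice the sketch mirrors is that spatial texture is handled once per viewpoint by the CNN, while the recurrence only has to model how descriptors evolve across viewpoints.
    
    
    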