    Pattern recognition in facial expressions: algorithms and applications (Reconhecimento de padrões em expressões faciais: algoritmos e aplicações)

    Advisor: Hélio Pedrini. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Emotion recognition has become a relevant research topic for the scientific community, since it plays an essential role in the continuous improvement of human-computer interaction systems. It can be applied in various areas, for instance, medicine, entertainment, surveillance, biometrics, education, social networks, and affective computing. There are some open challenges related to the development of emotion systems based on facial expressions, such as data that reflect more spontaneous emotions and real scenarios. In this doctoral dissertation, we propose different methodologies for the development of emotion recognition systems based on facial expressions, as well as their applicability to solving other similar problems. The first is an emotion recognition methodology for occluded facial expressions based on the Census Transform Histogram (CENTRIST). Occluded facial expressions are reconstructed using an algorithm based on Robust Principal Component Analysis (RPCA). Extraction of facial expression features is then performed by CENTRIST, as well as Local Binary Patterns (LBP), Local Gradient Coding (LGC), and an LGC extension. The generated feature space is reduced by applying Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms are used for classification. This method reached competitive accuracy rates for occluded and non-occluded facial expressions. The second proposes dynamic facial expression recognition based on Visual Rhythms (VR) and Motion History Images (MHI), such that a fusion of both descriptors encodes appearance, shape, and motion information of the video sequences. For feature extraction, Weber Local Descriptor (WLD), CENTRIST, Histogram of Oriented Gradients (HOG), and Gray-Level Co-occurrence Matrix (GLCM) are employed. This approach shows a new direction for performing dynamic facial expression recognition, and includes an analysis of the relevance of facial parts. The third is an effective method for audio-visual emotion recognition based on speech and facial expressions. The methodology involves a hybrid neural network to extract audio and visual features from videos. For audio extraction, a Convolutional Neural Network (CNN) based on the log Mel-spectrogram is used, whereas a CNN built on the Census Transform is employed for visual extraction. The audio and visual features are reduced by PCA and LDA, and classified through KNN, SVM, Logistic Regression (LR), and Gaussian Naïve Bayes (GNB). This approach achieves competitive recognition rates, especially on a spontaneous data set. The penultimate methodology investigates the problem of detecting Down syndrome from photographs. A geometric descriptor is proposed to extract facial features. Experiments performed on a public data set show the effectiveness of the developed methodology. The last methodology is about recognizing genetic disorders in photos. This method focuses on extracting facial features using deep features and anthropometric measurements. Experiments are conducted on a public data set, achieving competitive recognition rates.
    Degree: Doctorate in Computer Science (Doutora em Ciência da Computação). Funding: CNPQ 140532/2019-6, CAPE
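
    The Census Transform Histogram (CENTRIST) named in the abstract can be illustrated with a minimal sketch. It assumes the common convention in which each interior pixel is compared with its 8 neighbours and a bit is set to 1 when the centre is greater than or equal to the neighbour; the bit order, the border handling (border pixels are skipped) and the function names `census_transform` and `centrist` are illustrative choices, not details taken from the thesis.

```python
def census_transform(img):
    """Return the 8-bit Census Transform code of every interior pixel.

    img is a list of equal-length rows of grey levels. Border pixels are
    skipped (an assumption of this sketch), so the result is 2 rows and
    2 columns smaller than the input.
    """
    h, w = len(img), len(img[0])
    codes = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            centre = img[y][x]
            code = 0
            # Visit the 8 neighbours in row-major order; the first
            # neighbour ends up in the most significant bit.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    bit = 1 if centre >= img[y + dy][x + dx] else 0
                    code = (code << 1) | bit
            row.append(code)
        codes.append(row)
    return codes


def centrist(img):
    """Return the 256-bin histogram of Census Transform codes."""
    hist = [0] * 256
    for row in census_transform(img):
        for code in row:
            hist[code] += 1
    return hist


if __name__ == "__main__":
    flat = [[7] * 4 for _ in range(4)]
    # A flat patch makes every comparison true, so only bin 255 is filled.
    print(centrist(flat)[255])
```

    In a pipeline such as the one described, histograms like this would typically be computed per facial region and concatenated before PCA/LDA reduction and KNN or SVM classification; that step is omitted here.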

    Interaction of HPA axis genetics and early life stress shapes emotion recognition in healthy adults

    Background: Early life stress (ELS) affects facial emotion recognition (FER), as well as the underlying brain network. However, there is considerable inter-individual variability in these ELS-caused alterations. As the hypothalamic-pituitary-adrenal (HPA) axis is assumed to mediate neural and behavioural sequelae of ELS, the genetic disposition towards HPA axis reactivity might explain differential vulnerabilities. Methods: An additive genetic profile score (GPS) of HPA axis reactivity was built from 6 SNPs in 3 HPA axis-related genes (FKBP5, CRHR1, NR3C1). We studied two independent samples. As a proof of concept, GPS was tested as a predictor of cortisol increase in response to a psychosocial challenge (MIST) in a healthy community sample of n=40. For the main study, a sample of n=170 completed a video-based FER task and retrospectively reported ELS experiences in the Childhood Trauma Questionnaire (CTQ). Results: GPS positively predicted cortisol increase in the stress challenge over and above covariates. CTQ and genetic profile scores interacted to predict facial emotion recognition, such that ELS had a detrimental effect on emotion processing only in individuals with higher GPS. Post-hoc moderation analyses revealed that, while a less stress-responsive genetic profile was protective against ELS effects, individuals carrying a moderate to high GPS were affected by ELS in their ability to infer emotion from facial expressions. Discussion: These results suggest that a biologically informed genetic profile score can capture the genetic disposition to HPA axis reactivity and that it moderates the influence of early environmental factors on facial emotion recognition. Further research should investigate the neural mechanisms underlying this moderation. The GPS used here might prove a powerful tool for studying inter-individual differences in vulnerability to early life stress.
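
    The additive score and the moderation effect described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the score is assumed to be an unweighted sum of per-SNP counts (0, 1 or 2) of the allele taken to raise HPA axis reactivity, the moderation is assumed to be a linear model with a CTQ x GPS interaction term, and the function names and the coefficients used in the demo are hypothetical.

```python
def additive_gps(allele_counts):
    """Sum per-SNP allele counts into one additive genetic profile score.

    allele_counts holds one value per SNP: 0, 1 or 2 copies of the allele
    assumed (for this sketch) to increase HPA axis reactivity.
    """
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("each SNP must contribute 0, 1 or 2 alleles")
    return sum(allele_counts)


def els_simple_slope(b_els, b_interaction, gps):
    """Effect of early life stress on the outcome at a given GPS.

    For the model  y = b0 + b_els*ELS + b_gps*GPS + b_interaction*ELS*GPS,
    the slope of y with respect to ELS is  b_els + b_interaction*GPS.
    """
    return b_els + b_interaction * gps


if __name__ == "__main__":
    # Six SNPs, as in the abstract; the genotype itself is made up.
    score = additive_gps([0, 1, 2, 1, 0, 2])
    # Hypothetical coefficients chosen only to show the pattern:
    # no ELS effect at GPS = 0, an increasingly negative one as GPS grows.
    for gps in (0, score):
        print(gps, els_simple_slope(0.0, -0.5, gps))
```

    With a negative interaction coefficient, the ELS slope is flat for low scores and becomes more negative as the score rises, which is the qualitative pattern the abstract reports.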

    Family Relationship Analysis In Photos

    Family relationship analysis has many potential applications, ranging from homeland security to image search and social activity analysis. In our work, we present five computational problems for family relationship analysis in face photos. Studying these challenging problems is important and useful for semantic image understanding and social context extraction. First, familial traits are learned from pairs of salient local facial parts using discriminative approaches. This is motivated by human perception studies on kinship recognition and by the existence of familial traits through genetic inheritance. Second, kinship verification is performed on a pair of faces by integrating the familial traits based on confidence measures. Then, generation recognition and specific family relationship recognition are explored. Finally, the separation of family and non-family group photos is studied, based on a decision that combines multiple pair-wise kinship detections. An image database consisting of both family and non-family group photos is collected and labeled at different levels of detail. Experiments are performed on the database for all five tasks, based on different representations of the facial parts. Preliminary results show that the proposed problems can be addressed with reasonably good performance. Our encouraging results may inspire more effort from the computer vision and image processing research community.
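
    One simple way to combine multiple pair-wise kinship detections into a group-level decision is sketched below. The abstract does not state the combination rule, so the rule used here (the fraction of face pairs judged to be kin must reach a minimum) is an assumption, as are the function name `is_family_photo`, both thresholds, and the toy scoring function in the demo.

```python
from itertools import combinations


def is_family_photo(faces, kin_score, pair_threshold=0.5, min_kin_fraction=0.5):
    """Label a group photo as 'family' from pair-wise kinship scores.

    faces            -- any sequence of face representations
    kin_score        -- callable(face_a, face_b) -> confidence in [0, 1]
    pair_threshold   -- a pair counts as kin at or above this score
    min_kin_fraction -- fraction of kin pairs needed for a 'family' label
    """
    pairs = list(combinations(faces, 2))
    if not pairs:
        # A photo with fewer than two faces has no pair to verify.
        return False
    kin_pairs = sum(1 for a, b in pairs if kin_score(a, b) >= pair_threshold)
    return kin_pairs / len(pairs) >= min_kin_fraction


if __name__ == "__main__":
    # Toy stand-in for a real verifier: faces are family ids, and two
    # faces are "kin" exactly when the ids match.
    def toy_score(a, b):
        return 1.0 if a == b else 0.0

    print(is_family_photo([1, 1, 1], toy_score))
    print(is_family_photo([1, 2, 3], toy_score))
```

    A real system would replace `toy_score` with the confidence output of a kinship verifier and would tune both thresholds on labeled group photos.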

    Automatic Kinship Verification in Unconstrained Faces using Deep Learning

    Kinship verification has a number of applications, such as organizing large collections of images and recognizing resemblances among humans. Identifying kinship relations has also garnered interest due to several potential applications in security and surveillance and in organizing and tagging the enormous number of videos being uploaded to the Internet. This dissertation makes a five-fold contribution. First, a study is conducted to gain insight into the kinship verification process used by humans. In addition, two separate deep learning based methods are proposed to solve kinship verification in images and videos. Other contributions of this research include interlinking face verification with kinship verification and the creation of two kinship databases to facilitate research in this field. The WVU Kinship Database is created, which consists of multiple images per subject to facilitate kinship verification research. Next, the kinship video (KIVI) database of more than 500 individuals, with variations due to illumination, pose, occlusion, ethnicity, and expression, is collected for this research. It comprises a total of 355 true kin video pairs with over 250,000 still frames. In this dissertation, a human study is conducted to understand the capabilities of the human mind and to identify the discriminatory areas of a face that provide kinship cues. The visual stimuli presented to the participants determine their ability to recognize kin relationships using the whole face as well as specific facial regions. The effect of participant gender, age, and kin-relation pair of the stimulus is analyzed using quantitative measures such as accuracy, discriminability index d′, and perceptual information entropy. Next, utilizing the information obtained from the human study, a hierarchical Kinship Verification via Representation Learning (KVRL) framework is utilized to learn the representation of different face regions in an unsupervised manner. 
We propose a novel approach for feature representation termed filtered contractive deep belief networks (fcDBN). The proposed feature representation encodes relational information present in images using filters and a contractive regularization penalty. A compact representation of facial images of kin is extracted as the output from the learned model, and a multi-layer neural network is utilized to verify the kin accurately. The results show that the proposed deep learning framework (KVRL-fcDBN) yields state-of-the-art kinship verification accuracy on the WVU Kinship database and on four existing benchmark datasets. Additionally, we propose a new deep learning framework for kinship verification in unconstrained videos using a novel Supervised Mixed Norm regularization Autoencoder (SMNAE). This new autoencoder formulation introduces class-specific sparsity in the weight matrix. The proposed three-stage SMNAE based kinship verification framework utilizes the learned spatio-temporal representation in the video frames for verifying kinship in a pair of videos. The effectiveness of the proposed framework is demonstrated on the KIVI database and six existing kinship databases. On the KIVI database, SMNAE yields a video-based kinship verification accuracy of 83.18%, which is at least 3.2% better than existing algorithms. The algorithm is also evaluated on six publicly available kinship databases and compared with the best reported results. It is observed that the proposed SMNAE consistently yields the best results on all the databases. Finally, we end by discussing the connections between face verification and kinship verification research. We explore the area of self-kinship, which is age-invariant face recognition. Further, kinship information is used as a soft biometric modality to boost the performance of face verification via product of likelihood ratio and support vector machine based approaches. 
Using the proposed KVRL-fcDBN framework, an improvement of over 20% is observed in the performance of face verification. By addressing the problem of limited samples per kinship dataset, introducing real-world variations in unconstrained databases, and designing two deep learning frameworks, this dissertation improves the understanding of kinship verification across humans and the performance of automated systems. The algorithms proposed in this research have been shown to outperform existing algorithms across six different kinship databases and have, to date, the best reported results in this field.
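
    The discriminability index d′ used in the human study can be computed from a hit rate and a false-alarm rate with the standard signal-detection formula d′ = z(hit rate) − z(false-alarm rate), where z is the inverse of the standard normal CDF. The sketch below assumes that a "hit" means answering "kin" for a true kin pair and a "false alarm" means answering "kin" for a non-kin pair; the abstract does not spell this mapping out, and the function name `d_prime` is illustrative.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Signal-detection discriminability index d'.

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("rates must be strictly between 0 and 1")
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Made-up rates: a participant who says "kin" for 84% of kin pairs
    # and for 16% of non-kin pairs.
    print(round(d_prime(0.84, 0.16), 2))
```

    Equal hit and false-alarm rates give d′ = 0 (no discrimination), and larger values indicate that kin and non-kin pairs are told apart more reliably, independently of any bias towards answering "kin".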

    Result Oriented Based Face Recognition using Neural Network with Erosion and Dilation Technique

    It has been observed that many face recognition algorithms fail to recognize faces after plastic surgery or when the subject wears spectacles, both of which pose new challenges to automatic face recognition. Face detection is itself one of the challenging problems in image processing. This seminar introduces a face detection and recognition system that finds faces from a database of known people. Detecting the face before trying to recognize it saves a lot of work, because only a restricted region of the image is analyzed, in contrast to the many algorithms that consider the whole image. We present a study of Face Recognition After Plastic Surgery (FRAPS) and after wearing spectacles, with a careful analysis of the effects on facial appearance and the challenges they pose to face recognition. To address the FRAPS and spectacles problems, we propose an approach that combines optimized weight selection by a genetic algorithm for training an artificial neural network (ANN) with image erosion and dilation. Furthermore, given our results, we suggest that face detection deserves more attention. We also use an edge detection method so that the input image is detected properly. Together with edge detection, the genetic algorithm optimizes the ANN weights; the trained ANN is saved to a database and used for later face recognition comparisons. DOI: 10.17762/ijritcc2321-8169.16041
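
    Erosion and dilation, the two morphological operations the title refers to, can be shown on a small binary image. This is a minimal sketch assuming a 3x3 square structuring element and assuming that pixels outside the image are simply ignored; the paper does not state either choice, and the function names are illustrative. The difference between the dilated and eroded images (the morphological gradient) is included because it marks object boundaries, which relates to the edge detection step mentioned above.

```python
def _window(img, y, x):
    """Values in the 3x3 neighbourhood of (y, x), clipped to the image."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def erode(img):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [[min(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [[max(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def morphological_gradient(img):
    """Dilation minus erosion: 1 on object boundaries, 0 elsewhere."""
    d, e = dilate(img), erode(img)
    return [[d[y][x] - e[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]


if __name__ == "__main__":
    square = [[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]]
    for row in morphological_gradient(square):
        print(row)
```

    Libraries such as OpenCV provide optimized versions of these operations for grey-level images; the pure-Python version above is only meant to make the definitions concrete.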