6 research outputs found

    Development of an automatic facial expression analysis system for lie detection in adults using machine learning techniques

    This study builds its research base on three models, each cited and explained, that achieve high accuracy; its methodology aims to establish the relationship between facial micro-expressions and truthfulness, implemented through deep-learning analysis. There are 7 types of universal facial expressions: anger, disgust, fear, happiness, sadness, surprise and contempt. These expressions are independent of the race or culture of the world's regions. They can be faked, and it is small involuntary movements that reveal whether an expression is genuine or a lie. These small movements, called facial micro-expressions, last between 1/25 and 1/15 of a second and are imperceptible to the human eye. This degree work aims to recognize facial micro-expressions using a deep machine learning model. To this end, three models are developed, each applied to two facial micro-expression databases: SMIC (X. Li, T. Pfister, X. Huang, G. Zhao & M. Pietikäinen, 2013) and CASME II (Yan WJ, Li X, Wang SJ, Zhao G, Liu YJ, Chen YH & Fu X, 2014). The first model implemented was MicroExpSTCNN, proposed by S. P. Teja Reddy, S. Teja Karri, S. R. Dubey & S. Mukherjee (2019) on these same databases; this degree work obtained a higher accuracy on both (90% for CASME II and 91.6% for SMIC) than the 87.80% the reference reported for CASME II. The second model implemented was a 3D CNN with data augmentation that rotates the images by randomly chosen angles; it improved the accuracy on the CASME II database to 94.2%.
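The second model's augmentation step, rotating each clip by a randomly chosen angle, can be sketched as below; the clip shape (18 frames of 64x64 grayscale) and the rotation range are illustrative assumptions, not the thesis's actual settings.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_rotate(clip, max_deg=15.0, rng=None):
    """Rotate all frames of a (frames, H, W) clip by one random angle.

    The same angle is applied to every frame so the clip stays temporally
    consistent. The rotation range is an illustrative assumption.
    """
    rng = rng if rng is not None else np.random.default_rng()
    angle = float(rng.uniform(-max_deg, max_deg))
    # axes=(1, 2) rotates within each frame's spatial plane;
    # reshape=False keeps the original frame size.
    return rotate(clip, angle, axes=(1, 2), reshape=False, mode='nearest')

# Example: an 18-frame, 64x64 grayscale clip (shape assumed, not from the thesis)
clip = np.random.default_rng(0).random((18, 64, 64))
aug = augment_rotate(clip, max_deg=15.0, rng=np.random.default_rng(1))
```

Applying a single angle to the whole clip keeps the frames mutually consistent, which matters because a 3D CNN consumes the clip as one spatio-temporal volume.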
The third model was built with a temporal 2D CNN and an LSTM layer, which markedly improved prediction on both micro-expression databases because it exploited the temporal structure of the 18 frames. An application was also developed in which the neural network model was instantiated and the previously trained weights for both SMIC (X. Li, et al., 2013) and CASME II (Yan WJ, et al., 2014) were loaded; the Flask framework was used to visualize the video and display the facial micro-expression predicted by the model. (Master's degree work: Magíster en Ingeniería de Sistemas y Computación.)
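The third model's gain comes from modelling the clip as a sequence: per-frame CNN features are passed through an LSTM, so information is carried across the 18 frames. A minimal NumPy sketch of that recurrence (the 2D CNN is abstracted as precomputed feature vectors; all dimensions and weights are made-up stand-ins, not the thesis's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. Gate order in the stacked weights: input, forget,
    candidate, output."""
    n = h.size
    z = W @ x + U @ h + b
    i = sigmoid(z[:n])           # input gate
    f = sigmoid(z[n:2 * n])      # forget gate
    g = np.tanh(z[2 * n:3 * n])  # candidate cell state
    o = sigmoid(z[3 * n:])       # output gate
    c = f * c + i * g            # blend old memory with new evidence
    h = o * np.tanh(c)
    return h, c

# 18 per-frame feature vectors standing in for 2D CNN outputs (sizes assumed)
rng = np.random.default_rng(0)
feat_dim, hidden = 32, 16
frames = rng.standard_normal((18, feat_dim))
W = rng.standard_normal((4 * hidden, feat_dim)) * 0.1
U = rng.standard_normal((4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)

h = np.zeros(hidden)
c = np.zeros(hidden)
for x in frames:  # the recurrence carries information across the 18 frames
    h, c = lstm_step(x, h, c, W, U, b)
# h now summarises the whole clip and would feed a softmax classifier
```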

    Conventional and Neural Architectures for Biometric Presentation Attack Detection

    Facial biometrics, which enable efficient and reliable person recognition, have been growing continuously as an active sub-area of computer vision. Automatic face recognition offers a natural and non-intrusive method for recognising users from their facial characteristics. However, facial recognition systems are vulnerable to presentation attacks (or spoofing attacks), in which an attacker attempts to hide their true identity and masquerade as a valid user by misleading the biometric system. Thus, Facial Presentation Attack Detection (Facial PAD, or facial anti-spoofing) techniques, which aim to protect face recognition systems from such attacks, have been attracting growing research attention in recent years. Various systems and algorithms have been proposed and evaluated. This thesis explores and compares novel directions for detecting facial presentation attacks, spanning traditional features as well as approaches based on deep learning. In particular, different features encapsulating temporal information are developed and explored for describing the dynamic characteristics of presentation attacks. Hand-crafted features, deep neural architectures and their possible extensions are explored for application to PAD. The proposed traditional features address the problem of modelling distinct representations of presentation attacks in the temporal domain and consider two possible branches: behaviour-level and texture-level temporal information. The behaviour-level feature is developed from a symbolic system widely used in psychological studies and automated emotion analysis. The other traditional features aim to capture distinct differences in image quality, shading and skin reflections using dynamic texture descriptors. The thesis then explores deep learning approaches using different pre-trained neural architectures with the aim of improving detection performance.
In doing so, this thesis also explores visualisations of the networks' internal representations to inform the further development of such approaches and to suggest possible new directions for future research. These directions include an interpretable capability for deep learning PAD approaches and a fully automatic system-design capability in which the network architecture and parameters are determined by the available data. The interpretable capability can produce justifications for PAD decisions in both natural language and saliency-map formats; such systems can yield further performance improvement through an attention sub-network that learns from the justifications. Designing optimal deep neural architectures for PAD is still a complex problem that requires substantial effort from human experts, so the need for a system that can automatically design the neural architecture for a particular task is clear. A gradient-based neural architecture search algorithm is explored and extended through the development of different optimisation functions for automatically designing neural architectures for PAD. These extensions of the deep learning approaches were evaluated on challenging benchmark datasets, and the potential of the proposed approaches was demonstrated by comparison with state-of-the-art techniques and published results. The proposed methods were evaluated and analysed using publicly available datasets. The experimental results demonstrate the usefulness of temporal information and the potential benefits of applying deep learning techniques to presentation attack detection. In particular, the use of explanations to improve the usability and performance of deep learning PAD techniques, and automatic techniques for designing PAD neural architectures, show considerable promise for future development.
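As a toy illustration of how temporal information becomes a fixed-length feature for a PAD classifier, the sketch below pools inter-frame differences into a histogram. The thesis's actual behaviour-level and dynamic-texture descriptors are far richer; nothing here is taken from them.

```python
import numpy as np

def temporal_motion_descriptor(frames):
    """Toy temporal descriptor: per-pixel mean absolute inter-frame
    difference, pooled into a small histogram. Illustrates only the
    general shape of a temporal PAD feature, not the thesis's methods."""
    diffs = np.abs(np.diff(frames.astype(np.float64), axis=0))
    motion = diffs.mean(axis=0)  # per-pixel average motion over the clip
    hist, _ = np.histogram(motion, bins=8, range=(0.0, 1.0), density=True)
    return hist

# A perfectly static clip (e.g. a printed-photo attack held still) puts all
# histogram mass in the zero-motion bin; a live face would spread it out.
static = np.repeat(np.random.default_rng(0).random((1, 32, 32)), 10, axis=0)
desc = temporal_motion_descriptor(static)
```

A classifier would consume such a vector alongside texture features; the point is that the descriptor's length is fixed regardless of clip duration.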

    Deep human face analysis and modelling

    Human face appearance and motion play a significant role in creating the complex social environments of human civilisation. Humans possess the capacity to perform facial analysis and draw conclusions such as the identity of individuals, their emotional state and the presence of disease. This capacity is not universal across the population, however: medical conditions such as prosopagnosia and autism can directly affect an individual's facial analysis capabilities, while other facial analysis tasks require specific traits and training to perform well. This has led to research on facial analysis systems within the computer vision and machine learning fields over the previous decades, where the aim is to automate many facial analysis tasks to a level similar to, or surpassing, that of humans. With the emergence of deep learning methods in recent years, breakthroughs have been made in certain tasks and new state-of-the-art results achieved across many computer vision and machine learning problems. This thesis investigates the use of deep learning based methods for facial analysis systems; following a review of the literature, specific facial analysis tasks, methods and challenges are identified, which form the basis for the research findings presented. The research presented within this thesis focuses on the tasks of face detection and facial symmetry analysis, specifically for the medical condition facial palsy. First, an initial approach to face detection and symmetry analysis is proposed using a unified multi-task Faster R-CNN framework; this method achieves good accuracy on the test data sets for both tasks but also exhibits limitations from which the remaining chapters take their inspiration.
Next, the Integrated Deep Model is proposed for the tasks of face detection and landmark localisation, with specific focus on reducing false positive face detections, which is crucial for accurate facial feature extraction in the medical applications studied within this thesis. Evaluation on the Face Detection Dataset and Benchmark and the Annotated Faces in-the-Wild benchmark data sets shows a significant increase of over 50% in precision against other state-of-the-art face detection methods, while retaining a high level of recall. Facial symmetry and facial palsy grading are the focus of the final chapters, where both geometry-based symmetry features and 3D CNNs are applied. Evaluation shows that both methods are valid for grading facial palsy; the 3D CNNs are the most accurate, with an F1 score of 0.88, and are also capable of recognising mouth motion for those both with and without facial palsy, with an F1 score of 0.82.
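The grading results above are reported as F1 scores; as a reminder of what that metric measures, a minimal computation on made-up labels (not data from the thesis):

```python
import numpy as np

def f1_score_binary(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical labels: 4 true positives, 1 false positive, 1 false negative
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0]
f1 = f1_score_binary(y_true, y_pred)  # precision 4/5, recall 4/5, F1 0.8
```

Unlike plain accuracy, F1 stays informative when the classes are imbalanced, which is why it suits clinical grading tasks such as palsy recognition.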