
    Machine Analysis of Facial Expressions

    No abstract

    Relative Facial Action Unit Detection

    This paper presents a subject-independent facial action unit (AU) detection method by introducing the concept of relative AU detection, for scenarios where the neutral face is not provided. We propose a new classification objective function which analyzes the temporal neighborhood of the current frame to decide whether the expression has recently increased, decreased, or shown no change. This approach is a significant departure from the conventional absolute method, which classifies AUs from the current frame alone, without an explicit comparison with its neighboring frames. Our proposed method improves robustness to individual differences such as face scale and shape, age-related wrinkles, and transitions among expressions (e.g., lower intensity of expressions). Our experiments on three publicly available datasets (Extended Cohn-Kanade (CK+), Bosphorus, and DISFA) show significant improvement of our approach over conventional absolute techniques.
    Keywords: facial action coding system (FACS); relative facial action unit detection; temporal information.
    Comment: Accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV), Steamboat Springs, Colorado, USA, 2014.
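    The core idea, labeling temporal change rather than classifying each frame absolutely, can be illustrated with a minimal sketch. The window size, the change threshold, and both function names below are hypothetical illustrations, not the paper's actual objective function.

```python
import numpy as np

def relative_au_labels(intensities, window=3, delta=0.1):
    """Label each frame as increase / decrease / no-change by comparing it
    with the mean of a short preceding temporal neighborhood."""
    labels = []
    for t in range(len(intensities)):
        past = intensities[max(0, t - window):t]
        if len(past) == 0:
            labels.append("no-change")   # no history for the first frame
            continue
        diff = intensities[t] - past.mean()
        if diff > delta:
            labels.append("increase")
        elif diff < -delta:
            labels.append("decrease")
        else:
            labels.append("no-change")
    return labels

def absolute_au_labels(intensities, threshold=0.5):
    """The conventional absolute scheme: threshold each frame in isolation."""
    return ["present" if x > threshold else "absent" for x in intensities]

if __name__ == "__main__":
    seq = np.array([0.1, 0.1, 0.3, 0.6, 0.6, 0.4, 0.2])
    print(relative_au_labels(seq))   # tracks onsets and offsets
    print(absolute_au_labels(seq))   # depends on a subject-specific threshold
```

    Because the relative labels depend only on local change, they are insensitive to subject-specific baselines such as face shape or permanent wrinkles, which is the robustness the abstract claims.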

    Out-of-plane action unit recognition using recurrent neural networks

    A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2015.

    The face is a fundamental tool in interpersonal communication and interaction. Humans use facial expressions to consciously or subconsciously express their emotional states, such as anger or surprise. Humans can easily identify changes in facial expressions even in complicated scenarios, but facial expression recognition and analysis is a complex and challenging task for a computer. The automatic analysis of facial expressions has applications in several scientific fields, such as psychology, neurology, pain assessment, lie detection, intelligent environments, psychiatry, and emotion and paralinguistic communication.

    We look at methods of facial expression recognition, and in particular the recognition of the Facial Action Coding System's (FACS) Action Units (AUs). FACS encodes movements of individual facial muscles from slight, instantaneous changes in facial appearance; contractions of specific facial muscles are mapped to a set of units called AUs.

    We use Speeded Up Robust Features (SURF) to extract keypoints from the face and use the SURF descriptors to create feature vectors. SURF produces smaller feature vectors than other commonly used feature extraction techniques, is comparable to or outperforms other methods in distinctiveness, robustness, and repeatability, and is much faster than other feature detectors and descriptors. The SURF descriptor is scale- and rotation-invariant and is unaffected by small viewpoint or illumination changes.

    We use the SURF feature vectors to train a recurrent neural network (RNN) to recognize AUs from the Cohn-Kanade database. An RNN can handle temporal data from image sequences in which an AU or a combination of AUs develops from a neutral face. We recognize AUs because they provide a fine-grained means of measurement that is independent of age, ethnicity, gender, and differences in expression appearance.

    In addition to recognizing FACS AUs from the Cohn-Kanade database, we use our trained RNNs to recognize the development of pain in human subjects, using the UNBC-McMaster pain database, which contains image sequences of people experiencing pain. In some cases, the pain causes the face to move out-of-plane or produces some degree of in-plane movement. The temporal processing ability of RNNs can assist in classifying AUs where the face is occluded or not facing frontally for part of the sequence.

    Results are promising when tested on the Cohn-Kanade database, with higher overall recognition rates for upper-face AUs than for lower-face AUs. Since our system extracts keypoints globally from the face, local feature extraction could improve recognition results in future work. We also see satisfactory recognition results on samples with out-of-plane head movement, demonstrating the temporal processing ability of RNNs.
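    A rough sketch of the SURF-to-RNN pipeline described above follows. Note that SURF is patented and ships only in opencv-contrib builds with the non-free modules enabled; the mean-pooling of descriptors into one fixed-length frame vector, and all layer sizes and names, are simplifying assumptions rather than the dissertation's exact design.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn

def surf_frame_vector(gray, hessian_threshold=400):
    """Extract SURF descriptors from a grayscale face image and pool them
    into a single 64-D vector (mean over keypoints)."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    _, descriptors = surf.detectAndCompute(gray, None)
    if descriptors is None:            # no keypoints found in this frame
        return np.zeros(64, dtype=np.float32)
    return descriptors.mean(axis=0)    # default SURF descriptor is 64-D

class AuRNN(nn.Module):
    """An Elman RNN over per-frame SURF vectors; the final hidden state is
    mapped to per-AU logits for multi-label AU detection."""
    def __init__(self, in_dim=64, hidden=128, n_aus=12):
        super().__init__()
        self.rnn = nn.RNN(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_aus)

    def forward(self, x):              # x: (batch, frames, in_dim)
        _, h = self.rnn(x)
        return self.head(h[-1])        # (batch, n_aus) logits
```

    Feeding one pooled vector per frame keeps the sequence length equal to the number of frames, which is what lets the recurrent state accumulate evidence as an AU develops from the neutral face.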

    “Magic mirror in my hand, what is the sentiment in the lens?”: An action unit based approach for mining sentiments from multimedia contents

    In psychology and philosophy, emotion is a subjective, conscious experience characterized primarily by psychophysiological expressions, biological reactions, and mental states. Emotion can also be considered a "positive or negative experience" associated with a particular pattern of physiological activity. The extraction and recognition of emotions from multimedia content is therefore becoming one of the most challenging research topics in human-computer interaction. Facial expressions, posture, gestures, speech, and emotive changes of physical parameters (e.g., body temperature, blushing, and changes in the tone of the voice) can reflect changes in the user's emotional state, and all of these parameters can be detected and interpreted by a computer, leading to so-called "affective computing". In this paper, an approach for the extraction of emotions from images and videos is introduced. In particular, it involves extracting action units from facial expressions according to Ekman's theory. The proposed approach has been tested on standard and real datasets with interesting and promising results.
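    A minimal sketch of the AU-to-sentiment step follows. The AU combinations are the commonly cited EMFACS-style prototypes for Ekman's basic emotions, simplified (real FACS coding also weighs intensities and AU variants), and the fractional scoring rule and function name are illustrative assumptions, not the paper's method.

```python
# Commonly cited EMFACS-style AU prototypes for Ekman's basic emotions
# (simplified: real coding also considers intensities and variants).
EMOTION_AUS = {
    "happiness": {6, 12},
    "sadness":   {1, 4, 15},
    "surprise":  {1, 2, 5, 26},
    "fear":      {1, 2, 4, 5, 7, 20, 26},
    "anger":     {4, 5, 7, 23},
    "disgust":   {9, 15, 16},
}

def sentiment_from_aus(detected_aus):
    """Score each emotion by the fraction of its prototype AUs that were
    detected, then collapse the best match to a coarse sentiment polarity."""
    scores = {emo: len(aus & detected_aus) / len(aus)
              for emo, aus in EMOTION_AUS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return "neutral", "neutral", scores
    polarity = "positive" if best == "happiness" else "negative"
    return best, polarity, scores

print(sentiment_from_aus({6, 12, 25}))   # -> ('happiness', 'positive', ...)
```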

    An Efficient Boosted Classifier Tree-Based Feature Point Tracking System for Facial Expression Analysis

    The study of facial movement and expression has been a prominent area of research since the early work of Charles Darwin. The Facial Action Coding System (FACS), developed by Paul Ekman, introduced the first universal method of coding and measuring facial movement. Human-Computer Interaction seeks to make human interaction with computer systems more effective, easier, safer, and more seamless. Facial expression recognition can be broken down into three distinct subsections: facial feature localization, facial action recognition, and facial expression classification.

    The first and most important stage in any facial expression analysis system is the localization of key facial features. Localization must be accurate and efficient to ensure reliable tracking and leave time for computation and comparison with learned facial models while maintaining real-time performance. Two possible methods for localizing facial features are discussed in this dissertation. The Active Appearance Model is a statistical model describing an object's parameters through both shape and texture models, resulting in appearance. Statistical model-based training for object recognition takes multiple instances of the object class of interest (positive samples) and multiple negative samples, i.e., images that do not contain objects of interest. Viola and Jones present a highly robust real-time face detection system built on a statistically boosted attentional detection cascade composed of many weak feature detectors. A basic algorithm for the elimination of unnecessary sub-frames during Viola-Jones face detection is presented to further reduce image search time.

    A real-time emotion detection system is presented which is capable of identifying seven affective states (agreeing, concentrating, disagreeing, interested, thinking, unsure, and angry) from a near-infrared video stream. The Active Appearance Model is used to place 23 landmark points around key areas of the eyes, brows, and mouth. A prioritized binary decision tree then detects, based on the actions of these key points, whether one of the seven emotional states occurs as frames pass. The completed system runs accurately and achieves a real-time frame rate of approximately 36 frames per second.

    A novel facial feature localization technique utilizing a nested cascade classifier tree is proposed. A coarse-to-fine search is performed in which the regions of interest are defined by the responses of the Haar-like features comprising the cascade classifiers; the individual responses of the Haar-like features are also used to activate finer-level searches. A specially cropped training set derived from the Cohn-Kanade AU-Coded database is also developed and tested. Extensions of this research include further testing to verify the novel facial feature localization technique for a full 26-point face model, and implementation of a real-time, intensity-sensitive, automated Facial Action Coding System.
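    The coarse-to-fine, nested-cascade idea can be approximated with OpenCV's stock Haar cascades, which stand in here for the dissertation's custom classifier tree: detect the face first, then run a finer detector only inside the resulting sub-frame, which is also the spirit of the sub-frame elimination step.

```python
import cv2

# Stock OpenCV Haar cascades stand in for the custom nested classifier tree.
face_cc = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cc = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def coarse_to_fine(gray):
    """Two-level cascaded search: face first, then eyes only within the
    upper half of the detected face sub-frame."""
    faces = face_cc.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None, []
    x, y, w, h = faces[0]
    upper = gray[y:y + h // 2, x:x + w]     # eyes lie in the upper face half
    eyes = eye_cc.detectMultiScale(upper, scaleFactor=1.1, minNeighbors=5)
    # Map eye boxes back into full-image coordinates.
    eyes = [(x + ex, y + ey, ew, eh) for (ex, ey, ew, eh) in eyes]
    return (x, y, w, h), eyes
```

    Restricting the fine search to the face sub-frame is what keeps the overall search time compatible with real-time frame rates.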

    A dynamic texture based approach to recognition of facial actions and their temporal models

    In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images (MHIs) and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener dataset.
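    The basic MHI update (the plain variant, not the paper's extended version or its FFD alternative) is compact enough to sketch; the frame-differencing threshold and decay constant below are illustrative choices, not the paper's settings.

```python
import numpy as np

def update_mhi(mhi, prev_gray, gray, tau=15, diff_thresh=25):
    """One basic Motion History Image step: pixels that moved between two
    grayscale frames are set to `tau`; all other pixels decay by 1, so
    recent motion stays bright while older motion fades toward zero."""
    moved = np.abs(gray.astype(np.int16) - prev_gray.astype(np.int16)) > diff_thresh
    return np.where(moved, tau, np.maximum(mhi - 1, 0))

# Usage over a list of H x W uint8 frames:
# mhi = np.zeros(frames[0].shape, dtype=np.int16)
# for prev, cur in zip(frames, frames[1:]):
#     mhi = update_mhi(mhi, prev, cur)
# Orientation histograms of the MHI gradients then serve as the motion
# descriptors fed to the per-AU classifiers.
```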

    Human Activity Recognition in a Car with Embedded Devices

    Detection and prediction of drowsiness is key to the implementation of intelligent vehicles aimed at preventing highway crashes, and there are several approaches to such a solution. In this paper the computer vision approach is analysed, in which embedded devices (e.g., video cameras) are used along with machine learning and pattern recognition techniques to implement suitable solutions for detecting driver fatigue. Most research on computer vision systems has focused on the analysis of blinks; this is a notable solution when combined with additional patterns, such as yawning or head motion, for the recognition of drowsiness. The first step in this approach is face detection, where the AdaBoost algorithm shows accurate results for feature extraction, whereas for the detection of drowsiness, data-driven classifiers such as the Support Vector Machine (SVM) yield remarkable results. One underlying component in implementing a computer vision technology for detection of drowsiness is a database of spontaneous images coded with the Facial Action Coding System (FACS), on which the classifier can be trained. This paper introduces a straightforward prototype for detection of drowsiness, in which the Viola-Jones method is used for face detection and a cascade classifier is used to detect a contiguous sequence of closed-eye frames, which is taken as an indicator of drowsiness.
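    The prototype's decision rule, closed eyes over a contiguous run of frames, reduces to a streak counter on top of cascade detections. A minimal sketch with OpenCV's stock cascades follows; the frame limit and function name are assumptions, and the Haar eye cascade only fires on open eyes, which is exactly what makes a missing detection usable as a closed-eye signal.

```python
import cv2

face_cc = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cc = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

CLOSED_LIMIT = 15   # contiguous eyes-closed frames before alerting (tune per fps)

def monitor(camera_index=0):
    cap = cv2.VideoCapture(camera_index)
    closed_streak = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        eyes_open = False
        for (x, y, w, h) in face_cc.detectMultiScale(gray, 1.1, 5):
            roi = gray[y:y + h // 2, x:x + w]       # upper face: eye region
            if len(eye_cc.detectMultiScale(roi, 1.1, 5)) > 0:
                eyes_open = True                    # an open eye was found
        closed_streak = 0 if eyes_open else closed_streak + 1
        if closed_streak >= CLOSED_LIMIT:
            print("Drowsiness alert: contiguous run of closed-eye frames")
            closed_streak = 0
    cap.release()
```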