5 research outputs found

Reconhecimento de expressões faciais compostas em imagens 3D: ambiente forçado vs ambiente espontâneo [Recognition of compound facial expressions in 3D images: forced vs. spontaneous environments]

    Advisor: Prof. Dr. Olga Regina Pereira Bellon. Master's dissertation - Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defense: Curitiba, 16/12/2017. Includes references: p. 56-60. Area of concentration: Computer Science.
    Abstract: This research investigates Compound Facial Expressions (EFCs) in 3D images captured in two domains: forced (posed) and spontaneous. The work explores a modern categorization of facial expressions that differs from the basic facial expressions in that each category is constructed by combining two basic emotion categories. The investigation uses 3D images because of their intrinsic advantages: they do not suffer from variations in pose, lighting, or other changes in facial appearance. Both the forced capture domain (the subject is instructed to perform the expression) and the spontaneous capture domain (the subject produces the expression in response to stimuli) are considered, with the aim of comparing the two domains on EFC recognition, since they differ along several dimensions, including complexity, temporal dynamics, and intensity. Finally, a method for EFC recognition is proposed. The method represents a new application of existing detectors of facial muscle movements; these movements are denoted in the Facial Action Coding System (FACS) as Action Units (AUs). Accordingly, AU detectors for 3D images are implemented based on Local Depth Binary Patterns (LDBP). The method was then applied to two public databases of 3D images: Bosphorus (forced domain) and BP4D-Spontaneous (spontaneous domain). Note that the developed method does not differentiate EFCs that present the same AU configuration ("sadly disgusted", "appalled", and "hateful"), so these expressions are treated as a single "special case". In total, 14 EFCs are considered, plus the "special case" and images without EFCs. The results confirm the existence of EFCs in 3D images, and some of their characteristics were exploited. In addition, the spontaneous domain performed better at recognizing EFCs, both with the AUs annotated in the database and with the automatically detected AUs, recognizing more EFC cases and with better performance. To the best of our knowledge, this is the first time EFCs have been investigated in 3D images.
    Keywords: Compound facial expressions, FACS, AU detection, forced domain, spontaneous domain.
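    The abstract does not define LDBP beyond its name; below is a minimal illustrative sketch, assuming LDBP behaves like the standard 8-neighbour local binary pattern operator applied directly to a depth map (the function names are ours, and the per-AU classifiers that would consume the histograms are omitted).

```python
import numpy as np

def local_depth_binary_pattern(depth: np.ndarray) -> np.ndarray:
    """8-neighbour LBP codes over a 2D depth map.

    Each pixel is compared with its 8 neighbours; a neighbour at least
    as deep as the centre contributes a '1' bit to the 8-bit code.
    """
    h, w = depth.shape
    centre = depth[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Neighbours enumerated clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = depth[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    return codes

def ldbp_histogram(depth_region: np.ndarray) -> np.ndarray:
    """Normalised histogram of LDBP codes for one facial region,
    usable as the feature vector of a binary per-AU detector."""
    codes = local_depth_binary_pattern(depth_region)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)
```

    In a pipeline like the one described, one such histogram per facial region would feed a binary detector per AU, and the detected AU set would then be matched against the AU configurations of the 14 EFC categories.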

    Discriminant Multi-Label Manifold Embedding for Facial Action Unit Detection

    This article describes a system for participation in the Facial Expression Recognition and Analysis (FERA2015) sub-challenge on spontaneous action unit occurrence detection. AU detection is by nature a multi-label classification problem, a fact overlooked by most existing work; the correlation between AUs has the potential to increase detection accuracy. We investigate the multi-label AU detection problem by embedding the data on low-dimensional manifolds that preserve multi-label correlation. For this, we apply the multi-label Discriminant Laplacian Embedding (DLE) method as an extension to our base system. The system uses SIFT features around a set of facial landmarks, enhanced with additional non-salient points around transient facial features. Both the base system and the DLE extension outperform the challenge baseline on the two challenge databases, achieving an average F1-measure close to 50% on the testing partition (9.9% higher than the baseline in the best case). The DLE extension proves useful for certain AUs, but also shows the need for further analysis to assess its benefits in general.
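    The multi-label correlation that DLE preserves can be illustrated without the full embedding machinery; the sketch below (NumPy, with a toy AU label matrix standing in for real FACS annotations) computes the normalised AU co-occurrence statistics that such an embedding is designed to retain.

```python
import numpy as np

# Toy multi-label matrix: rows = frames, columns = AUs (1 = AU active).
# A real system would take these labels from FACS annotations.
Y = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)

cooc = Y.T @ Y                      # raw pairwise co-occurrence counts
freq = np.sqrt(np.diag(cooc))       # per-AU activation frequency
corr = cooc / np.outer(freq, freq)  # cosine-normalised correlation

print(np.round(corr, 2))  # corr[i, j]: how consistently AUs i and j co-fire
```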

    Automatisierte Erkennung und Evaluation von therapeutischen Übungen für Patienten mit Mimikdysfunktionen [Automated recognition and evaluation of therapeutic exercises for patients with facial movement dysfunctions]

    This thesis presents a flexible, camera-based training system for the rehabilitation of facial paralysis (facial palsy) and related facial muscle dysfunctions. The system supports independent patient training by automatically evaluating the execution of twelve facial exercises and providing multi-level feedback to the user; it is thus suited to supplement the regular exercise sessions supervised by a speech therapist. Automated grading and diagnosis systems for facial paralysis are a prominent topic in the literature on clinical image processing; in contrast, only a few papers deal with automated training systems for facial muscle re-education, and their underlying algorithms are typically specialized to particular facial exercises and, unlike the system presented here, difficult to adapt to additional exercises. The contributions of this thesis comprise the main components of the system architecture, with a methodical and experimental emphasis on feature extraction and on deriving the feedback from the extracted feature descriptors. Relative to the state of the art, the major novelty lies in the possibility to flexibly extend the system to additional facial exercises and in the derivation of both global and region-specific feedback. The selected approaches rely on processing 3D-camera data and include the extraction of point signatures, histograms of oriented normal vectors, and curvature, distance, and angle features. The feedback generation is based on random-forest classifiers and the pairwise proximities derived from trained forests; these proximities estimate the feature-level similarity between the patient's exercise execution and the model executions in the training data.
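    The pairwise random-forest proximities mentioned above follow Breiman's standard construction: two samples are similar to the extent that they land in the same leaves across the forest. A minimal sketch using scikit-learn's `RandomForestClassifier.apply`, which returns the leaf index each sample reaches in every tree (the synthetic features, labels, and the choice of "model executions" are purely illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))            # stand-in for exercise features
y = (X[:, 0] > 0).astype(int)           # stand-in for exercise labels

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# leaves[i, t] = index of the leaf that sample i reaches in tree t.
leaves = rf.apply(X)

# Proximity of two samples = fraction of trees where they share a leaf.
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# A patient sample's feedback score could be its mean proximity to the
# model executions (here: the first ten training samples, illustratively).
score = prox[0, :10].mean()
print(round(score, 3))
```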

    Individual and Inter-related Action Unit Detection in Videos for Affect Recognition

    The human face has evolved to become the most important source of non-verbal information conveying our affective, cognitive and mental state to others. Apart from human-to-human communication, facial expressions have also become an indispensable component of human-machine interaction (HMI). Systems capable of understanding how users feel enable a wide variety of applications in medical, learning, entertainment and marketing technologies, in addition to advancing research in neuroscience, psychology and other fields. The Facial Action Coding System (FACS) was built to objectively define and quantify every possible facial movement through what are called Action Units (AUs), each representing an individual facial action. In this thesis we focus on the automatic detection and exploitation of these AUs using novel appearance representation techniques as well as the incorporation of prior co-occurrence information between them. Our contributions can be grouped into three parts.
    In the first part, we propose to improve the detection accuracy of appearance features based on local binary patterns (LBP) for AU detection in videos, through two novel methodologies. The first uses three fundamental image-processing tools as a pre-processing step prior to applying the LBP transform to the facial texture; each tool enhances the descriptive ability of LBP by emphasizing different transient appearance characteristics, and our experiments show that they increase AU detection accuracy significantly. The second uses multiple local curvature Gabor binary patterns (LCGBP) for the same problem and achieves state-of-the-art performance on a dataset of mostly posed facial expressions; the curvature information of the face, together with the proposed multiple-filter-size scheme, is very effective in recognizing these individual facial actions.
    In the second part, we propose to take advantage of the co-occurrence relations between AUs, which we can learn from training examples. We use this information in a multi-label discriminant Laplacian embedding (DLE) scheme to train our system with SIFT features extracted around the salient and transient landmarks of the face. The system is first validated without the DLE on a challenging dataset containing many occlusions and head-pose variations; we then show the performance of the full system on the FERA 2015 challenge on AU occurrence detection. The challenge consists of two difficult datasets containing spontaneous facial actions at different intensities, and we demonstrate that our proposed system achieves the best results on these datasets for detecting AUs.
    The third and last part of the thesis presents an application of this automatic AU detection system to real-life situations, particularly the detection of cognitive distraction. Our contribution in this part is two-fold. First, we present a novel visual database of people driving a simulator while visual and cognitive distraction is induced via secondary tasks. The subjects were recorded using three near-infrared camera-lighting systems, a configuration well suited to real driving conditions, i.e. with large head-pose and ambient-light variations. Second, we propose an original framework to automatically discriminate cognitive-distraction sequences from baseline sequences by extracting features from continuous AU signals and exploiting the cross-correlations between them. We achieve very high classification accuracy in our subject-based experiments and a lower yet acceptable performance in the subject-independent tests. Based on these results, we discuss how facial expressions related to this complex mental state are individual rather than universal, and how the proposed system could be used in a vehicle to help decrease human error in traffic accidents.
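    The abstract does not spell out the cross-correlation features; the sketch below assumes each AU is available as a continuous per-frame intensity signal and takes the peak of the normalised cross-correlation of every AU pair as one feature (the function names and the downstream classifier are our assumptions, not the thesis's definitions).

```python
import numpy as np

def peak_cross_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Peak of the normalised cross-correlation between two AU signals."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    xcorr = np.correlate(a, b, mode="full") / len(a)
    return float(xcorr.max())

def au_pair_features(au_signals: np.ndarray) -> np.ndarray:
    """Pairwise peak cross-correlations, flattened into a feature vector.

    au_signals: array of shape (n_aus, n_frames) of continuous AU values.
    """
    n = au_signals.shape[0]
    feats = [peak_cross_correlation(au_signals[i], au_signals[j])
             for i in range(n) for j in range(i + 1, n)]
    return np.asarray(feats)

# Example: 5 AU signals over 200 frames -> 10 pairwise features, which
# a binary classifier could map to distraction vs. baseline.
signals = np.random.default_rng(1).normal(size=(5, 200))
print(au_pair_features(signals).shape)  # (10,)
```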