
    Reconhecimento de padrões em expressões faciais: algoritmos e aplicações (Pattern recognition in facial expressions: algorithms and applications)

    Advisor: Hélio Pedrini. Doctoral dissertation, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Emotion recognition has become a relevant research topic in the scientific community, since it plays an essential role in the continuous improvement of human-computer interaction systems. It can be applied in various areas, for instance, medicine, entertainment, surveillance, biometrics, education, social networks, and affective computing. There are some open challenges related to the development of emotion systems based on facial expressions, such as data that reflect more spontaneous emotions and real scenarios. In this doctoral dissertation, we propose different methodologies for the development of emotion recognition systems based on facial expressions, as well as their applicability to solving other similar problems. The first is an emotion recognition methodology for occluded facial expressions based on the Census Transform Histogram (CENTRIST). Occluded facial expressions are reconstructed using an algorithm based on Robust Principal Component Analysis (RPCA). Extraction of facial expression features is then performed by CENTRIST, as well as Local Binary Patterns (LBP), Local Gradient Coding (LGC), and an LGC extension. The generated feature space is reduced by applying Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) algorithms are used for classification. This method reached competitive accuracy rates for occluded and non-occluded facial expressions. The second proposes dynamic facial expression recognition based on Visual Rhythms (VR) and Motion History Images (MHI), such that a fusion of both descriptors encodes appearance, shape, and motion information from the video sequences. For feature extraction, the Weber Local Descriptor (WLD), CENTRIST, the Histogram of Oriented Gradients (HOG), and the Gray-Level Co-occurrence Matrix (GLCM) are employed. This approach offers a new direction for dynamic facial expression recognition, along with an analysis of the relevance of facial parts. The third is an effective method for audio-visual emotion recognition based on speech and facial expressions. The methodology involves a hybrid neural network to extract audio and visual features from videos. For audio extraction, a Convolutional Neural Network (CNN) based on the log Mel-spectrogram is used, whereas a CNN built on the Census Transform is employed for visual extraction. The audio and visual features are reduced by PCA and LDA, then classified through KNN, SVM, Logistic Regression (LR), and Gaussian Naïve Bayes (GNB). This approach achieves competitive recognition rates, especially on a spontaneous data set. The fourth investigates the problem of detecting Down syndrome from photographs. A geometric descriptor is proposed to extract facial features. Experiments performed on a public data set show the effectiveness of the developed methodology. The last methodology addresses the recognition of genetic disorders in photographs. This method extracts facial attributes using deep neural network features and anthropometric measurements. Experiments are conducted on a public data set, achieving competitive recognition rates.
    Doctorate in Computer Science. Funding: CNPq grant 140532/2019-6; CAPES.
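    The first methodology's pipeline (handcrafted texture features, PCA and LDA reduction, then KNN or SVM classification) can be sketched in a few lines. The snippet below is a minimal illustration, not the dissertation's implementation: scikit-image's Local Binary Patterns stand in for CENTRIST, random arrays stand in for face images, and the RPCA occlusion-reconstruction step is omitted.

```python
# Minimal sketch, assuming 64x64 grayscale faces and 7 emotion classes.
# LBP (scikit-image) stands in for CENTRIST; RPCA reconstruction omitted.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(200, 64, 64)).astype(np.uint8)  # placeholder faces
labels = rng.integers(0, 7, size=200)                               # placeholder emotions

def lbp_histogram(img, P=8, R=1):
    """Uniform-LBP histogram as a holistic texture descriptor."""
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

X = np.array([lbp_histogram(img) for img in images])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

# Feature-space reduction: PCA followed by LDA, as in the abstract.
pca = PCA(n_components=8).fit(X_tr)
lda = LinearDiscriminantAnalysis(n_components=5).fit(pca.transform(X_tr), y_tr)
Z_tr = lda.transform(pca.transform(X_tr))
Z_te = lda.transform(pca.transform(X_te))

# Classification with KNN and SVM.
for clf in (KNeighborsClassifier(n_neighbors=3), SVC(kernel="rbf")):
    print(type(clf).__name__, clf.fit(Z_tr, y_tr).score(Z_te, y_te))
```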

    Design and development of facial recognition-based library management system (FRLMS)

    In this paper, we propose a facial recognition-based library management system, namely the Facial Recognition-based Library Management System (FRLMS). The system aims to improve the user experience of the library authentication process through a facial recognition algorithm, making authentication simple and efficient since it is performed seamlessly. For the purposes of this study, feature extraction and image classification are performed using OpenCV and TensorFlow, and the average recognition accuracy reaches up to 92.15%.
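    The abstract names the tools but not the model, so the following is only a hedged sketch of the OpenCV-plus-TensorFlow division of labour it describes: OpenCV's Haar cascade locates a face, and a small Keras CNN classifies the crop against enrolled patrons. The architecture, input size, and number of users are assumptions.

```python
# Hedged sketch: OpenCV detects the face, a small (untrained here) Keras CNN
# classifies the crop against enrolled users. All sizes are assumptions.
import cv2
import numpy as np
import tensorflow as tf

NUM_USERS = 50  # hypothetical number of enrolled library patrons

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_USERS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) would train on labelled face crops of enrolled users.

def authenticate(frame_gray: np.ndarray):
    """Return the predicted user id for the largest detected face, or None."""
    faces = detector.detectMultiScale(frame_gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])        # largest detection
    crop = cv2.resize(frame_gray[y:y + h, x:x + w], (64, 64)) / 255.0
    probs = model.predict(crop.reshape(1, 64, 64, 1), verbose=0)[0]
    return int(np.argmax(probs))
```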

    Pengenalan Emosi Wajah Manusia Menggunakan Biorthogonal Wavelet Entropy dan Support Vector Machine (Human facial emotion recognition using biorthogonal wavelet entropy and support vector machine)

    Recognizing emotion within an interaction is key to the success of that interaction, so research into how computers can recognize human emotions is needed. High-dimensional data are difficult to classify, which makes dimensionality reduction necessary. This final project investigates an emotion recognition system based on human facial expressions, focusing on dimensionality reduction. The emotion categories to be recognized are angry, happy, sad, afraid, disgusted, surprised, and neutral. To recognize these emotions, Biorthogonal Wavelet Entropy (BWE) is used for feature extraction and dimensionality reduction, and a Multi-class Support Vector Machine (MSVM) is used for classification. Implementation results on the JAFFE dataset show that the entropy step in BWE did not successfully reduce the dimensionality of the coefficient subbands produced by the Biorthogonal Wavelet Transform decomposition: the highest accuracy obtained with BWE was 44.45%, whereas the highest accuracy obtained when the entropy step was not used was 82.73%.
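    A minimal sketch of the pipeline described above, assuming a bior2.2 wavelet and a two-level decomposition (the project's exact choices are not stated): PyWavelets computes the biorthogonal wavelet decomposition, each subband is summarised by one Shannon entropy value (or kept as raw coefficients when the entropy step is skipped), and a one-vs-rest SVM classifies the result.

```python
# Sketch of BWE features: biorthogonal wavelet subbands summarised by Shannon
# entropy (set use_entropy=False to keep raw coefficients instead), then a
# one-vs-rest multi-class SVM. Wavelet and level are assumptions.
import numpy as np
import pywt
from sklearn.svm import SVC

def bwe_features(img, wavelet="bior2.2", level=2, use_entropy=True):
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    subbands = [coeffs[0]] + [b for triple in coeffs[1:] for b in triple]
    if not use_entropy:
        return np.concatenate([b.ravel() for b in subbands])
    feats = []
    for band in subbands:
        hist, _ = np.histogram(band, bins=32)
        p = hist / hist.sum()
        p = p[p > 0]
        feats.append(-np.sum(p * np.log2(p)))  # Shannon entropy of the subband
    return np.array(feats)

rng = np.random.default_rng(0)
X = np.stack([bwe_features(rng.normal(size=(64, 64))) for _ in range(70)])
y = rng.integers(0, 7, size=70)  # 7 emotion classes, as on JAFFE
svm = SVC(decision_function_shape="ovr").fit(X, y)  # one-vs-rest SVM
```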

    Ubiquitous Technologies for Emotion Recognition

    Emotions play a very important role in how we think and behave. The emotions we feel every day can compel us to act and can influence the decisions and plans we make about our lives. Being able to measure, analyze, and better comprehend how or why our emotions change is thus of much relevance to understanding human behavior and its consequences. Despite the great efforts made in the past in the study of human emotions, it is only now, with the advent of wearable, mobile, and ubiquitous technologies, that we can aim to sense and recognize emotions continuously and in real time. This book brings together the latest experiences, findings, and developments regarding ubiquitous sensing, modeling, and the recognition of human emotions.

    Automatic emotion perception using eye movement information for E-Healthcare systems.

    Engaging with adolescents and detecting their emotional state is vital for promoting rehabilitation therapy within an E-Healthcare system. Focusing on a novel approach for a sensor-based E-Healthcare system, we propose an eye movement information-based emotion perception algorithm that collects and analyzes electrooculography (EOG) signals and eye movement video synchronously. Specifically, we extract time-frequency eye movement features by first applying the short-time Fourier transform (STFT) to the raw multi-channel EOG signals. Subsequently, to integrate the time-domain eye movement features (i.e., saccade duration, fixation duration, and pupil diameter), we investigate two feature fusion strategies: feature-level fusion (FLF) and decision-level fusion (DLF). Recognition experiments have also been performed for three emotional states: positive, neutral, and negative. The average accuracies are 88.64% (the FLF method) and 88.35% (the DLF with maximal rule method), respectively. Experimental results reveal that eye movement information can effectively reflect the emotional state of adolescents, which provides a promising tool to improve the performance of E-Healthcare systems.
    Funding: Anhui Provincial Natural Science Research Project of Colleges and Universities Fund under Grant KJ2018A0008; Open Fund for Discipline Construction, Institute of Physical Science and Information Technology, Anhui University; and National Natural Science Fund of China under Grant 61401002.
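    The two fusion strategies compared above can be sketched as follows, with synthetic stand-ins for the EOG recordings and the three time-domain features. The SVM classifier and the STFT settings are assumptions rather than the paper's choices.

```python
# Sketch of feature-level vs. decision-level fusion on synthetic data.
import numpy as np
from scipy.signal import stft
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, fs = 120, 256
eog = rng.normal(size=(n, 4, fs * 2))  # 4-channel EOG, 2 s per trial (synthetic)
eye = rng.normal(size=(n, 3))          # saccade dur., fixation dur., pupil diam.
y = rng.integers(0, 3, size=n)         # positive / neutral / negative

def tf_features(trial):
    """Time-frequency features: mean STFT log-power per channel and frequency."""
    _, _, Z = stft(trial, fs=fs, nperseg=64)
    return np.log(np.abs(Z) ** 2 + 1e-8).mean(axis=-1).ravel()

X_tf = np.stack([tf_features(trial) for trial in eog])

# Feature-level fusion (FLF): concatenate modalities, train one classifier.
flf = SVC(probability=True).fit(np.hstack([X_tf, eye]), y)

# Decision-level fusion (DLF) with a maximal rule: take, per trial, the class
# whose posterior is highest across the two per-modality classifiers.
clf_tf = SVC(probability=True).fit(X_tf, y)
clf_eye = SVC(probability=True).fit(eye, y)
posteriors = np.maximum(clf_tf.predict_proba(X_tf), clf_eye.predict_proba(eye))
dlf_predictions = posteriors.argmax(axis=1)
```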

    State of the art of audio- and video-based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications.
    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them into their daily environments and lives.
    In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location, and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.
    This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, (v) activity and behaviour recognition, (vi) activity and personal assistance, (vii) gesture recognition, (viii) fall detection and prevention, (ix) mobility assessment and frailty recognition, and (x) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential coming from the silver economy is overviewed.

    Affect Analysis and Membership Recognition in Group Settings

    PhD thesis. Emotions play an important role in our day-to-day life in various ways, including, but not limited to, how we humans communicate and behave. Machines can interact with humans more naturally and intelligently if they are able to recognise and understand humans' emotions and express their own. To achieve this goal, over the past two decades researchers have paid considerable attention to the analysis of affective states, which has been studied extensively across various fields, such as neuroscience, psychology, cognitive science, and computer science. Most existing works focus on affect analysis in individual settings, where there is one person in an image or video. In the real world, however, people are very often with others, or interact in group settings. In this thesis, we focus on affect analysis in group settings. Affect analysis in group settings differs from that in individual settings and poses more challenges, due to the dynamic interactions between group members, the various occlusions among people in the scene, and the complex context, e.g., who people are with, where they are, and the mutual influences among people in the group. Because of these challenges, a number of open issues still need further investigation in order to advance the state of the art and explore methodologies for affect analysis in group settings. These open topics include, but are not limited to: (1) is it possible to transfer the methods used for the affect recognition of a person in individual settings to the affect recognition of each individual in group settings? (2) is it possible to recognise the affect of one individual using the expressed behaviours of another member of the same group (i.e., cross-subject affect recognition)? (3) can non-verbal behaviours be used for the recognition of contextual information in group settings? In this thesis, we investigate affect analysis in group settings and propose methods to explore the aforementioned research questions step by step. Firstly, we propose a method for individual affect recognition in both individual and group videos, which is also used for social context prediction, i.e., whether a person is alone or within a group. Secondly, we introduce a novel framework for cross-subject affect analysis in group videos. Specifically, we analyse the correlation of affect among group members and investigate the automatic recognition of the affect of one subject using the behaviours expressed by another subject in the same group or in a different group. Furthermore, we propose methods for contextual information prediction in group settings, i.e., group membership recognition: recognising which group a person belongs to. Comprehensive experiments are conducted using two datasets, one containing individual videos and one containing group videos. The experimental results show that (1) the methods used for affect recognition of a person in individual settings can be transferred to group settings; (2) the affect of one subject in a group can be better predicted using the expressive behaviours of another subject within the same group than using those of a subject from a different group; and (3) contextual information (i.e., whether a person is staying alone or within a group, and group membership) can be predicted successfully from non-verbal behaviours.

    Sensor Technologies to Manage the Physiological Traits of Chronic Pain: A Review

    Non-oncologic chronic pain is a common, high-morbidity impairment worldwide, acknowledged as a condition with a significant impact on quality of life. Pain intensity is largely perceived as a subjective experience, which makes its objective measurement challenging. However, the physiological traces of pain make it possible to correlate it with vital signs, such as heart rate variability, skin conductance, or the electromyogram, or with health performance metrics derived from daily activity monitoring or facial expressions, which can be acquired with diverse sensor technologies and multisensory approaches. As the assessment and management of pain are essential issues for a wide range of clinical disorders and treatments, this paper reviews different sensor-based approaches applied to the objective evaluation of non-oncological chronic pain. The space of available technologies and resources aimed at pain assessment represents a diversified set of alternatives that can be exploited to address the multidimensional nature of pain.
    Funding: Ministerio de Economía y Competitividad (Instituto de Salud Carlos III) PI15/00306; Junta de Andalucía PIN-0394-2017; Unión Europea "FRAIL".
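    As one concrete example of the vital-sign correlates mentioned above, the snippet below computes RMSSD, a standard time-domain heart rate variability metric, from synthetic R-peak timestamps; it is illustrative only and not taken from the review.

```python
# Illustrative only: RMSSD from synthetic R-peak times (seconds).
import numpy as np

r_peaks = np.cumsum(0.8 + 0.05 * np.random.default_rng(0).normal(size=60))
rr_ms = np.diff(r_peaks) * 1000.0              # inter-beat (RR) intervals in ms
rmssd = np.sqrt(np.mean(np.diff(rr_ms) ** 2))  # root mean square of successive RR differences
print(f"RMSSD: {rmssd:.1f} ms")
```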