1,224 research outputs found

    Searching for Best Predictors of Paralinguistic Comprehension and Production of Emotions in Communication in Adults With Moderate Intellectual Disability

    Paralinguistic comprehension and production of emotions in communication include the skills of recognizing and interpreting emotional states with the help of facial expressions, prosody and intonation. In the relevant scientific literature, the skills of paralinguistic comprehension and production of emotions in communication are related primarily to receptive language abilities, although some authors also found correlations with intellectual abilities and acoustic features of the voice. Therefore, the aim of this study was to investigate which of the mentioned variables (receptive language ability, acoustic features of voice, intellectual ability, socio-demographic variables) is the most relevant predictor of paralinguistic comprehension and paralinguistic production of emotions in communication in adults with moderate intellectual disabilities (MID). The sample included 41 adults with MID, 20–49 years of age (M = 34.34, SD = 7.809), 29 of whom had MID of unknown etiology, while 12 had Down syndrome. All participants were native speakers of Serbian. Two subscales from The Assessment Battery for Communication – Paralinguistic comprehension of emotions in communication and Paralinguistic production of emotions in communication – were used to assess the examinees' paralinguistic comprehension and production skills. To assess the examinees on the assumed predictor variables, the following instruments were used: the Peabody Picture Vocabulary Test for receptive language abilities, the Computerized Speech Lab ("Kay Elemetrics" Corp., model 4300) for acoustic features of voice, and Raven's Progressive Matrices for intellectual ability. Hierarchical regression analysis was applied to investigate the extent to which the proposed variables are actual predictors of paralinguistic comprehension and production of emotions in communication as dependent variables. 
The results of this analysis showed that only receptive language skills had statistically significant predictive value for paralinguistic comprehension of emotions (β = 0.468, t = 2.236, p < 0.05), while the factor related to voice frequency and interruptions, from the domain of acoustic voice characteristics, displayed predictive value for paralinguistic production of emotions (β = 0.280, t = 2.076, p < 0.05). Consequently, this study provides evidence that, in the adult population with MID, voice and language are more important than intellectual abilities in understanding and producing emotions.
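
    As a rough illustration of what a hierarchical (blockwise) regression does, the sketch below enters predictor blocks one at a time and reports the cumulative R² after each step, so the R² change for a later block can be read off. It is a minimal sketch only: the variable names, the order of entry and the toy numbers are assumptions made for illustration, and nothing here reproduces the study's data, software or results.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(design[i][j] * design[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    xty = [sum(design[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter predictor blocks in order; return the cumulative R^2 after each step."""
    entered, steps = [], []
    for block in blocks:
        entered.extend(block)
        steps.append(r_squared(entered, y))
    return steps


# Toy data, invented purely for illustration (not taken from the study).
vocabulary = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # hypothetical block 1 predictor
voice_factor = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # hypothetical block 2 predictor
outcome = [1.0, 4.0, 3.0, 6.0, 5.0, 8.0]       # hypothetical dependent variable

steps = hierarchical_r2([[vocabulary], [voice_factor]], outcome)
delta = steps[1] - steps[0]  # R^2 change attributable to the second block
print(steps, delta)
```

    In practice a statistics package would also report standardized coefficients and significance tests for each step; the sketch only shows the blockwise structure.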

    Early and late brain signatures of emotional prosody among individuals with high versus low power

    Using ERPs, we explored the relationship between social power and emotional prosody processing. In particular, we investigated differences at early and late processing stages between individuals primed with high or low power. Comparable to previously published findings from nonprimed participants, individuals primed with low power displayed differentially modulated P2 amplitudes in response to different emotional prosodies, whereas participants primed with high power failed to do so. Similarly, participants primed with low power showed differentially modulated amplitudes in response to different emotional prosodies at a later processing stage (late ERP component), whereas participants primed with high power did not. These ERP results suggest that high versus low power leads to emotional prosody processing differences at the early stage associated with emotional salience detection and at a later stage associated with more in-depth processing of emotional stimuli.

    Personalized face and gesture analysis using hierarchical neural networks

    The video-based computational analyses of human face and gesture signals encompass a myriad of challenging research problems involving computer vision, machine learning and human-computer interaction. In this thesis, we focus on the following challenges: a) the classification of hand and body gestures along with the temporal localization of their occurrence in a continuous stream, b) the recognition of facial expressivity levels in people with Parkinson's disease using multimodal feature representations, c) the prediction of student learning outcomes in intelligent tutoring systems using affect signals, and d) the personalization of machine learning models, which can adapt to subject and group-specific nuances in facial and gestural behavior. Specifically, we first conduct a quantitative comparison of two approaches to the problem of segmenting and classifying gestures on two benchmark gesture datasets: a method that simultaneously segments and classifies gestures versus a cascaded method that performs the tasks sequentially. Second, we introduce a framework that computationally predicts an accurate score for facial expressivity and validate it on a dataset of interview videos of people with Parkinson's disease. Third, based on a unique dataset of videos of students interacting with MathSpring, an intelligent tutoring system, collected by our collaborative research team, we build models to predict learning outcomes from their facial affect signals. Finally, we propose a novel solution to a relatively unexplored area in automatic face and gesture analysis research: personalization of models to individuals and groups. We develop hierarchical Bayesian neural networks to overcome the challenges posed by group or subject-specific variations in face and gesture signals. 
We successfully validate our formulation on the problems of personalized subject-specific gesture classification, context-specific facial expressivity recognition and student-specific learning outcome prediction. We demonstrate the flexibility of our hierarchical framework by validating the utility of both fully connected and recurrent neural architectures.

    A system for recognizing human emotions based on speech analysis and facial feature extraction: applications to Human-Robot Interaction

    With advances in Artificial Intelligence, humanoid robots are starting to interact with ordinary people based on a growing understanding of psychological processes. Accumulating evidence in Human-Robot Interaction (HRI) suggests that research is focusing on establishing emotional communication between human and robot in order to create social perception, cognition, desired interaction and sensation. Furthermore, robots need to perceive human emotion and optimize their behavior to help and interact with human beings in various environments. The most natural way to recognize basic emotions is to extract sets of features from human speech, facial expression and body gesture. A system for recognition of emotions based on speech analysis and facial feature extraction can have interesting applications in Human-Robot Interaction. Thus, the Human-Robot Interaction ontology explains how the knowledge of these fundamental sciences is applied in physics (sound analysis), mathematics (face detection and perception), philosophical theory (behavior) and robotics. In this project, we carry out a study to recognize basic emotions (sadness, surprise, happiness, anger, fear and disgust). We also propose a methodology and a software program for classification of emotions based on speech analysis and facial feature extraction. The speech analysis phase investigates the appropriateness of using acoustic (pitch value, pitch peak, pitch range, intensity and formant) and phonetic (speech rate) properties of emotive speech with the freeware program PRAAT, and consists of generating and analyzing a graph of speech signals. The proposed architecture investigates the appropriateness of analyzing emotive speech with minimal use of signal processing algorithms. 
Thirty participants in the experiment had to repeat five sentences in English (with durations typically between 0.40 s and 2.5 s) in order to extract data on pitch (value, range and peak) and rising-falling intonation. Pitch alignments (peak, value and range) were evaluated and the results were compared with intensity and speech rate. The facial feature extraction phase uses a mathematical formulation (Bézier curves) and geometric analysis of the facial image, based on measurements of a set of Action Units (AUs), to classify the emotion. The proposed technique consists of three steps: (i) detecting the facial region within the image, (ii) extracting and classifying the facial features, (iii) recognizing the emotion. The new data were then merged with reference data in order to recognize the basic emotion. Finally, we combined the two proposed algorithms (speech analysis and facial expression) in order to design a hybrid technique for emotion recognition. This technique has been implemented in a software program, which can be employed in Human-Robot Interaction. The efficiency of the methodology was evaluated by experimental tests on 30 individuals (15 female and 15 male, 20 to 48 years old) from different ethnic groups, namely: (i) ten European adults, (ii) ten Asian (Middle East) adults and (iii) ten American adults. Ultimately, the proposed technique made it possible to recognize the basic emotion in most cases.
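
    The abstract mentions Bézier curves as the mathematical formulation behind the facial feature extraction but does not give the degree of the curves or how control points are chosen. As a minimal sketch, assuming a cubic curve and four hypothetical control points (for example along a lip contour), a point on the curve can be evaluated with the Bernstein form:

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] (Bernstein form)."""
    u = 1.0 - t
    weights = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    controls = (p0, p1, p2, p3)
    x = sum(w * p[0] for w, p in zip(weights, controls))
    y = sum(w * p[1] for w, p in zip(weights, controls))
    return (x, y)


def sample_curve(p0, p1, p2, p3, samples=5):
    """Return evenly spaced (in t) points along the curve, endpoints included."""
    return [cubic_bezier(p0, p1, p2, p3, i / (samples - 1)) for i in range(samples)]


# Hypothetical control points in image coordinates, chosen only for illustration.
contour = sample_curve((0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0))
print(contour)
```

    The curve passes through the first and last control points and is pulled toward the two inner ones, so a small set of control points can summarize the shape of a facial contour.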

    Prosody and Kinesics Based Co-analysis Towards Continuous Gesture Recognition

    The aim of this study is to develop a multimodal co-analysis framework for continuous gesture recognition by exploiting prosodic and kinesic manifestations of natural communication. Using this framework, a co-analysis pattern between correlating components is obtained. The co-analysis pattern is clustered using K-means clustering to determine how well the pattern distinguishes the gestures. Features that differentiate the proposed approach from other models are its lower susceptibility to idiosyncrasies, its scalability, and its simplicity. The experiment was performed on the Multimodal Annotated Gesture Corpus (MAGEC), which we created for the research community working on understanding non-verbal communication, particularly gestures.
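
    The clustering step named in the abstract can be sketched with a plain implementation of K-means (Lloyd's algorithm). This is an illustration only: the two-dimensional feature vectors and the initial centroids below are made up, and the abstract does not say what features, number of clusters or initialization the authors used.

```python
def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors standing in for co-analysis patterns.
patterns = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(patterns, centroids=[patterns[0], patterns[3]])
print(labels, centers)
```

    Comparing the resulting cluster labels with the annotated gesture classes is one way to judge how well such a pattern separates the gestures.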