
    MSAC: Multiple Speech Attribute Control Method for Speech Emotion Recognition

    Despite significant progress, speech emotion recognition (SER) remains challenging due to the inherent complexity and ambiguity of the emotion attribute, particularly under in-the-wild conditions. Whereas current studies primarily focus on recognition and generalization capabilities, this work pioneers an exploration into the reliability of SER methods and investigates how to model speech emotion in terms of its data distribution across various speech attributes. Specifically, we first build a novel CNN-based SER model that adopts an additive margin softmax loss to expand the distance between features of different classes, thereby enhancing their discrimination. Second, a novel multiple speech attribute control method, MSAC, is proposed to explicitly control speech attributes, enabling the model to be less affected by emotion-agnostic attributes and to capture more fine-grained emotion-related features. Third, we make a first attempt to test and analyze the reliability of the proposed SER workflow using an out-of-distribution detection method. Extensive experiments in both single- and cross-corpus SER scenarios show that our proposed unified SER workflow consistently outperforms the baseline in terms of recognition, generalization, and reliability. Moreover, in single-corpus SER, the proposed workflow achieves superior recognition results, with a WAR of 72.97% and a UAR of 71.76% on the IEMOCAP corpus.
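    The additive margin softmax named in this abstract is a known loss from face and speaker verification: it subtracts a margin from the target-class cosine logit before scaling. Below is a minimal PyTorch sketch of such a loss layer; the embedding size, scale s, and margin m are illustrative assumptions, not settings reported in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxLoss(nn.Module):
    """Additive margin softmax: widen inter-class distance via a cosine margin."""
    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.35):
        super().__init__()
        self.s, self.m = s, m  # illustrative scale/margin, not the paper's values
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalised embeddings and class weights.
        cosine = F.linear(F.normalize(features), F.normalize(self.weight))
        # Subtract the margin m from the target-class cosine only.
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = self.s * (cosine - one_hot * self.m)
        return F.cross_entropy(logits, labels)

# Usage sketch: loss = AMSoftmaxLoss(256, 4)(embeddings, emotion_labels)
```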

    Multimodal emotion recognition based on the fusion of vision, EEG, ECG, and EMG signals

    This paper presents a novel approach to emotion recognition (ER) based on electroencephalogram (EEG), electromyogram (EMG), electrocardiogram (ECG), and computer vision. The proposed system includes two different models, one for physiological signals and one for facial expressions, deployed in a real-time embedded system. A custom dataset of EEG, ECG, EMG, and facial-expression recordings was collected from 10 participants using an Affective Video Response System. Time-, frequency-, and wavelet-domain features were extracted and optimized based on visualizations from exploratory data analysis (EDA) and principal component analysis (PCA). Local Binary Patterns (LBP), Local Ternary Patterns (LTP), Histogram of Oriented Gradients (HOG), and Gabor descriptors were used to differentiate facial emotions. Classification models, namely decision tree, random forest, and optimized variants thereof, were trained on these features. For the physiological-signal model, the optimized random forest achieved an accuracy of 84% and the optimized decision tree 76%. The facial emotion recognition (FER) model attained accuracies of 84.6%, 74.3%, 67%, and 64.5% using K-nearest neighbors (KNN), random forest, decision tree, and XGBoost, respectively. Performance metrics, including area under the curve (AUC), F1 score, and the receiver operating characteristic (ROC) curve, were computed to evaluate the models. The outputs of the two models, i.e., the bio-signal and facial emotion analyses, are passed to a voting classifier to obtain the final emotion, achieving an accuracy of 87.5%. A comprehensive report is then generated using a generative pretrained transformer (GPT) language model based on the resultant emotion. The model was implemented and deployed on a Jetson Nano, and the results show its relevance to ER. It has applications in enhancing prosthetic systems and other medical fields such as psychological therapy, rehabilitation, assistance for individuals with neurological disorders, mental health monitoring, and biometric security.
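    The final fusion step described above, feeding both models' outputs to a voting classifier, can be illustrated with a short scikit-learn sketch. The feature dimensions, classifier settings, and equal modality weights below are placeholders for illustration, not details reported in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_bio = rng.normal(size=(100, 24))   # placeholder EEG/ECG/EMG feature vectors
X_face = rng.normal(size=(100, 32))  # placeholder LBP/HOG/Gabor feature vectors
y = rng.integers(0, 4, size=100)     # placeholder labels for four emotion classes

# One model per modality, as in the abstract (classifier choices are illustrative).
bio_model = RandomForestClassifier(n_estimators=100).fit(X_bio, y)
face_model = KNeighborsClassifier(n_neighbors=5).fit(X_face, y)

# Soft voting: average the per-class probabilities from both modalities,
# then take the most probable class as the final emotion.
proba = 0.5 * bio_model.predict_proba(X_bio) + 0.5 * face_model.predict_proba(X_face)
fused_emotion = proba.argmax(axis=1)
```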

    Role of emotion in information retrieval

    The main objective of Information Retrieval (IR) systems is to satisfy searchers’ needs. A great deal of research has been conducted in the past to achieve a better insight into searchers’ needs and the factors that can potentially influence the success of an Information Retrieval and Seeking (IR&S) process. One such factor is the searcher’s emotion. Previous research has shown that emotion plays an important role in the success of an IR&S process whose purpose is to satisfy an information need. However, these studies do not give a sufficiently prominent position to emotion in IR, since they limit its role to that of a secondary factor, assuming that a lack of knowledge (the need for information) is the primary factor (the motivation of the search). In this thesis, we propose to treat emotion as the principal factor in a searcher’s system of needs, and therefore one that ought to be considered by retrieval algorithms.

    We present a more realistic view of searchers’ needs by considering theories not only from information retrieval and information science, but also from psychology, philosophy, and sociology. We report extensively on the role of emotion in every aspect of human behaviour, at both the individual and the social level. This serves not only to modify current IR views of emotion, but more importantly to uncover social situations where emotion is the primary factor (i.e., the source of motivation) in an IR&S process. We also show that the emotion aspect of documents plays an important part in satisfying the searcher’s need, in particular when emotion is indeed a primary factor. Given the above, we define three concepts, called emotion need, emotion object, and emotion relevance, and present a conceptual map that utilises these concepts in IR tasks and scenarios.

    To investigate the practical concepts, emotion object and emotion relevance, in a real-life application, we first study the possibility of extracting emotion from text, since this is the first pragmatic challenge to be solved before any IR task can be tackled. For this purpose, we developed a text-based emotion extraction system and demonstrate that it outperforms other available emotion extraction approaches. Using this system, the usefulness of the practical concepts is studied in two scenarios: movie recommendation and news diversification.

    In the movie recommendation scenario, two collaborative filtering (CF) models were proposed. CF systems aim to recommend items to a user based on information gathered from other users with similar interests. CF techniques do not handle data sparsity well, especially in the cold-start case, where an item has no past ratings. To predict the rating of an item for a given user, the first and second models extend state-of-the-art memory-based and model-based CF systems, respectively. The features used by the models are two emotion spaces, extracted from the movie plot summary and from users’ reviews, and three semantic spaces, namely actor, director, and genre. Experiments with two MovieLens datasets show that the inclusion of emotion information significantly improves prediction accuracy compared with state-of-the-art CF techniques, and also alleviates the data sparsity issues; a sketch of the idea follows this abstract.

    In the news retrieval scenario, a novel way of diversifying results, namely diversifying on the emotion aspect of documents, is proposed. Two approaches are introduced to incorporate emotion features into diversification, and they are empirically tested on the TREC 678 Interactive Track collection. The results show that emotion features are capable of enhancing retrieval effectiveness. Overall, this thesis shows that emotion plays a key role in IR and that its importance needs to be considered. At a more detailed level, it illustrates the crucial part that emotion can play in:
    • searchers, both as a primary (emotion need) and a secondary factor (influential role) in an IR&S process;
    • enhancing the representation of a document using emotion features (emotion object); and
    • improving the effectiveness of IR systems at satisfying searchers’ needs (emotion relevance).
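    As a rough illustration of how an emotion space extracted from plot summaries could extend memory-based CF, and why it helps with cold-start items that have no ratings, consider the sketch below. The toy emotion vectors and the pure emotion-based similarity are assumptions for illustration; the thesis combines emotion spaces with semantic (actor, director, genre) spaces.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Hypothetical per-movie emotion vectors, e.g. (joy, sadness, fear, anger),
# extracted from plot summaries. No ratings are needed to build these, which
# is what makes them usable for cold-start items.
item_emotion = {
    "movie_a": np.array([0.8, 0.1, 0.0, 0.1]),
    "movie_b": np.array([0.7, 0.2, 0.0, 0.1]),  # target item, no past ratings
    "movie_c": np.array([0.1, 0.6, 0.2, 0.1]),
}
user_ratings = {"movie_a": 5.0, "movie_c": 2.0}

def predict(target: str, ratings: dict) -> float:
    # Similarity-weighted average of the user's past ratings, with similarity
    # computed in the emotion space rather than from co-rating patterns.
    sims = {i: cosine(item_emotion[target], item_emotion[i]) for i in ratings}
    den = sum(abs(s) for s in sims.values())
    return sum(s * ratings[i] for i, s in sims.items()) / den if den else \
        float(np.mean(list(ratings.values())))

print(predict("movie_b", user_ratings))  # close to 5.0: emotionally near movie_a
```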

    Multimodal Affective Communication Analysis: Fusing Speech Emotion and Text Sentiment Using Machine Learning

    Affective communication, encompassing verbal and non-verbal cues, is crucial for understanding human interactions. This study introduces a novel framework for enhancing emotional understanding by fusing speech emotion recognition (SER) and sentiment analysis (SA). We leverage diverse features and both classical and deep learning models, including Gaussian naive Bayes (GNB), support vector machines (SVMs), random forests (RFs), a multilayer perceptron (MLP), and a 1D convolutional neural network (1D-CNN), to accurately discern and categorize emotions in speech. We further extract text sentiment from speech-to-text conversion, analyzing it with models such as bidirectional encoder representations from transformers (BERT), generative pre-trained transformer 2 (GPT-2), and logistic regression (LR). To improve individual model performance for both SER and SA, we employ an extended dynamic Bayesian mixture model (DBMM) ensemble classifier. Our most significant contribution is a novel two-layered DBMM (2L-DBMM) for multimodal fusion, which effectively integrates speech emotion and text sentiment, enabling the classification of more nuanced, second-level emotional states. Evaluating our framework on the EmoUERJ (Portuguese) and ESD (English) datasets, the extended DBMM achieves accuracy rates of 96% and 98% for SER and 85% and 95% for SA, while the 2L-DBMM reaches 96% and 98% for combined emotion classification, respectively. Our findings demonstrate the superior performance of the extended DBMM over individual classifiers for each modality, and of the 2L-DBMM for merging modalities, highlighting the value of ensemble methods and multimodal fusion in affective communication analysis. The results underscore the potential of our approach for enhancing emotional understanding, with broad applications in fields such as mental health assessment, human–robot interaction, and cross-cultural communication.
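    The two-layered fusion can be pictured as a weighted mixture of class posteriors applied twice: once over base classifiers within each modality, and once across the two modality-level posteriors. The sketch below uses fixed weights and made-up posterior vectors purely for illustration; the paper's DBMM additionally adapts its weights dynamically, which is not modelled here.

```python
import numpy as np

def mixture(posteriors, weights):
    """Weighted mixture of class-posterior vectors, renormalised to sum to 1."""
    out = sum(w * p for w, p in zip(weights, posteriors))
    return out / out.sum()

# Layer 1, speech branch: fuse base SER classifiers (posteriors over
# four hypothetical classes: neutral, happy, sad, angry).
ser = mixture([np.array([0.10, 0.60, 0.20, 0.10]),   # e.g. 1D-CNN output
               np.array([0.20, 0.50, 0.20, 0.10])],  # e.g. SVM output
              weights=[0.6, 0.4])

# Layer 1, text branch: fuse sentiment models mapped to the same classes.
sa = mixture([np.array([0.30, 0.40, 0.20, 0.10]),    # e.g. BERT output
              np.array([0.25, 0.45, 0.20, 0.10])],   # e.g. LR output
             weights=[0.7, 0.3])

# Layer 2: fuse the two modality posteriors into the final emotion estimate.
final = mixture([ser, sa], weights=[0.5, 0.5])
print(final.argmax())  # index of the fused emotion class
```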

    Creating Stories From Parents' Premature Birth Experiences to Engender Empathy in Nursing Students

    Introduction: In healthcare, stories are an evocative way to educate nurses about the emotional experiences of patients. Little is known, however, about the impact of storytelling in neonatal nursing practice and education. Aims: The aims were to explore how parents of premature babies described their neonatal care experience, to develop digital stories informed by their narratives, and to investigate how these may contribute to empathic learning in nursing students and staff. Methods: Within an interpretive narrative design using principles from constructivism, twenty narrative interviews with parents of premature babies were undertaken to collect their stories. Core story creation reconfigured the raw narratives into digital stories using the ASPIRE model. Thematic and metaphor analyses were also applied. Finally, a mixed-methods approach investigated the perceived value of the stories for empathic learning among nursing students and staff in neonatal care. Findings: Parents described their experience in strongly emotional narratives, revealing important learning points for those caring for them. Metaphors were a common way to express emotion, and frequent metaphor clusters provided pivotal themes for the creation of the digital stories. Four key themes emerged from the analysis: the effect of digital stories on emotion and empathy; the perceived value of digital stories for learning and knowledge acquisition; the potential impact of digital stories on practice; and the format of digital stories for representing emotion and evoking empathy. Overall, student nurses and staff evaluated the digital stories positively. They were an effective way to teach others about the emotional experiences of parents and had the potential to enhance empathy. Many participants indicated that the stories might influence their practice by enhancing their understanding of the emotional needs of parents. Discussion: Digital stories appear to be an effective and evocative way of telling the stories of others and depicting their emotional experience, from which we can learn. Emotions can be a source of knowledge, and digital stories representing parents’ experiences and emotions may enrich empathic learning. Conclusion: Students and staff place value on parent stories for enhancing empathic learning within the neonatal field. Digital stories can be one way of teaching the emotional aspects of care that places parents at the centre.

    Interpersonal Emotion Regulation Questionnaire (IERQ): scale development and psychometric characteristics

    Despite the popularity of emotion regulation in the contemporary literature, research has focused almost exclusively on intrapersonal processes, whereas much less attention has been paid to interpersonal emotion regulation processes. To encourage research on interpersonal emotion regulation, we present a series of four studies developing the Interpersonal Emotion Regulation Questionnaire (IERQ). The final scale consists of 20 items with 4 factors of 5 items each: Enhancing Positive Affect, Perspective Taking, Soothing, and Social Modeling. The scale shows excellent psychometric characteristics. Implications for future research are discussed.

    Boredom at work: What, why, and what then?
