1,609 research outputs found

    Artificial Intelligence for Suicide Assessment using Audiovisual Cues: A Review

    Death by suicide is the seventh leading cause of death worldwide. Recent advances in Artificial Intelligence (AI), specifically AI applications in image and voice processing, have created a promising opportunity to revolutionize suicide risk assessment. Accordingly, we have witnessed a fast-growing body of research that applies AI to extract audiovisual non-verbal cues for mental illness assessment. However, the majority of recent works focus on depression, despite the evident differences between the symptoms and non-verbal cues of depression and those of suicidal behavior. This paper reviews recent works that study the detection of suicidal ideation and suicidal behavior through audiovisual feature analysis, mainly the analysis of acoustic features of suicidal voice/speech and of suicidal visual cues. Automatic suicide assessment is a promising research direction that is still in its early stages; accordingly, there is a lack of the large datasets needed to train the machine learning and deep learning models that have proven effective in other, similar tasks.
    Comment: Manuscript submitted to Artificial Intelligence Review (2022).
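    As a rough illustration of the acoustic analysis such studies build on, the sketch below extracts a few standard speech descriptors (MFCCs, pitch, energy) with librosa and pools them into a fixed-length feature vector. The file name and the specific feature choices are illustrative assumptions, not taken from any reviewed paper.

        # Minimal sketch of acoustic feature extraction for speech-based
        # mental state assessment; "speech.wav" is a placeholder path.
        import numpy as np
        import librosa

        def extract_acoustic_features(path, sr=16000):
            y, sr = librosa.load(path, sr=sr)                   # load and resample
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # spectral shape
            f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)       # pitch contour
            rms = librosa.feature.rms(y=y)                      # frame energy
            # Mean/std pooling turns frame-level descriptors into one
            # fixed-length vector, a common input to classical models.
            return np.concatenate([
                mfcc.mean(axis=1), mfcc.std(axis=1),
                [np.nanmean(f0), np.nanstd(f0), rms.mean(), rms.std()],
            ])

        features = extract_acoustic_features("speech.wav")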

    Robust Modeling of Epistemic Mental States

    This work identifies and advances several research challenges in the analysis of facial features and their temporal dynamics with respect to epistemic mental states in dyadic conversations. The epistemic states considered are Agreement, Concentration, Thoughtful, Certain, and Interest. In this paper, we perform a number of statistical analyses and simulations to identify the relationship between facial features and epistemic states. Non-linear relations are found to be more prevalent, while temporal features derived from the original facial features demonstrate a strong correlation with intensity changes. We then propose a novel prediction framework that takes facial features and their non-linear relation scores as input and predicts different epistemic states in videos. Prediction of the epistemic states is boosted when the classification of emotion-change regions (rising, falling, or steady-state) is incorporated with the temporal features. The proposed predictive models predict the epistemic states with significantly improved accuracy: the correlation coefficient (CoERR) is 0.827 for Agreement, 0.901 for Concentration, 0.794 for Thoughtful, 0.854 for Certain, and 0.913 for Interest.
    Comment: Accepted for publication in Multimedia Tools and Applications, Special Issue: Socio-Affective Technologies.
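    The prevalence of non-linear relations can be illustrated by comparing a linear (Pearson) and a rank (Spearman) correlation per facial feature, since a rank correlation clearly above the linear one suggests a monotonic but non-linear link. The sketch below uses invented frame-level data; the feature names and the per-frame intensity annotation are hypothetical.

        # Sketch: linear vs. rank correlation of facial features with an
        # annotated epistemic-state intensity (toy data throughout).
        import numpy as np
        from scipy.stats import pearsonr, spearmanr

        rng = np.random.default_rng(0)
        n_frames = 500
        intensity = rng.uniform(0, 1, n_frames)        # per-frame annotation
        features = {
            "brow_raise": np.exp(3 * intensity) + rng.normal(0, 1, n_frames),
            "lip_corner": 0.8 * intensity + rng.normal(0, 0.1, n_frames),
        }

        for name, series in features.items():
            r_lin, _ = pearsonr(series, intensity)
            r_rank, _ = spearmanr(series, intensity)
            # A Spearman value well above the Pearson one points to a
            # monotonic, non-linear relation of the kind the paper reports.
            print(f"{name}: pearson={r_lin:.2f} spearman={r_rank:.2f}")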

    Automatic Personality Prediction; an Enhanced Method Using Ensemble Modeling

    Human personality is significantly reflected in the words a person uses in speech or writing. With the spread of information infrastructures (specifically the Internet and social media), human communication has shifted notably away from face-to-face interaction. Generally, Automatic Personality Prediction (or Perception) (APP) is the automated forecasting of personality from different types of human-generated or exchanged content (such as text, speech, images, and video). The major objective of this study is to enhance the accuracy of APP from text. To this end, we propose five new APP methods: term frequency vector-based, ontology-based, enriched ontology-based, latent semantic analysis (LSA)-based, and deep learning-based (BiLSTM). These base methods are then combined through ensemble modeling (stacking) with a hierarchical attention network (HAN) as the meta-model, so that they contribute to each other to enhance APP accuracy. The results show that ensemble modeling enhances the accuracy of APP.
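    A minimal stacking sketch conveys the ensemble idea: several base classifiers feed a meta-model that learns how to combine their outputs. Here a shared TF-IDF representation with two simple classifiers stands in for the five proposed base methods, and logistic regression stands in for the HAN meta-model; these substitutions, and the toy data, are assumptions made for brevity.

        # Sketch of stacked (ensemble) personality prediction from text.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.svm import LinearSVC
        from sklearn.ensemble import StackingClassifier

        texts = ["i love meeting new people", "big parties give me energy",
                 "i prefer quiet evenings alone", "crowds drain me quickly"]
        labels = [1, 1, 0, 0]   # toy binary trait labels (e.g., extraversion)

        vec = TfidfVectorizer()
        X = vec.fit_transform(texts)                 # shared text features
        base_learners = [("nb", MultinomialNB()), ("svm", LinearSVC())]
        # The meta-model is trained on the base models' predictions, so the
        # base methods "contribute to each other" through stacking.
        clf = StackingClassifier(estimators=base_learners,
                                 final_estimator=LogisticRegression(), cv=2)
        clf.fit(X, labels)
        print(clf.predict(vec.transform(["i enjoy big parties"])))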

    Empathy Detection Using Machine Learning on Text, Audiovisual, Audio or Physiological Signals

    Empathy is a social skill that indicates an individual's ability to understand others. Over the past few years, empathy has drawn attention from various disciplines, including but not limited to Affective Computing, Cognitive Science, and Psychology. Empathy is a context-dependent term; thus, detecting or recognising empathy has potential applications in society, healthcare, and education. Despite being a broad and overlapping topic, the avenue of empathy detection studies leveraging Machine Learning remains underexplored from a holistic literature perspective. To this end, we systematically collect and screen 801 papers from 10 well-known databases and analyse the 54 selected papers. We group the papers based on the input modalities of empathy detection systems, i.e., text, audiovisual, audio, and physiological signals. We examine modality-specific pre-processing and network architecture design protocols, popular dataset descriptions and availability details, and evaluation protocols. We further discuss potential applications, deployment challenges, and research gaps in the Affective Computing-based empathy domain, which can facilitate new avenues of exploration. We believe that our work is a stepping stone toward developing a privacy-preserving and unbiased empathic system, inclusive of culture, diversity, and multilingualism, that can be deployed in practice to enhance the overall well-being of human life.

    The conflict escalation resolution (CONFER) database

    Conflict is usually defined as a high level of disagreement that takes place when individuals act on incompatible goals, interests, or intentions. Research in the human sciences has recognized conflict as one of the main dimensions along which an interaction is perceived and assessed. Hence, automatic estimation of conflict intensity in naturalistic conversations would be a valuable tool for the advancement of human-centered computing and the deployment of novel applications for social skills enhancement, including conflict management and negotiation. However, machine analysis of conflict is still limited to just a few works, partially due to an overall lack of suitable annotated data, and it has mostly been approached as a conflict or (dis)agreement detection problem based on audio features only. In this work, we aim to overcome these limitations by a) presenting the Conflict Escalation Resolution (CONFER) Database, a set of excerpts from audiovisual recordings of televised political debates where conflicts naturally arise, and b) reporting baseline experiments on audiovisual conflict intensity estimation. The database contains approximately 142 min of recordings in the Greek language, split over 120 non-overlapping episodes of naturalistic conversations that involve two or three interactants. Subject- and session-independent experiments are conducted on continuous-time (frame-by-frame) estimation of real-valued conflict intensity, as opposed to binary conflict/non-conflict classification. For the problem at hand, the efficiency of various audio and visual features, their fusion, and various regression frameworks is examined. Experimental results suggest that there is much room for improvement in the design and development of automated multi-modal approaches to continuous conflict analysis. The CONFER Database is publicly available for non-commercial use at http://ibug.doc.ic.ac.uk/resources/confer/.
    Highlights: The Conflict Escalation Resolution (CONFER) Database is presented. CONFER contains 142 min (120 episodes) of recordings in the Greek language. Episodes are extracted from TV political debates where conflicts naturally arise. The experiments are the first approach to continuous estimation of conflict intensity. The performance of various audio and visual features and classifiers is evaluated.
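    A baseline of the kind reported can be sketched as feature-level fusion of frame-wise audio and visual descriptors followed by a regressor that outputs a real-valued intensity per frame. The matrices below are synthetic placeholders for real descriptors, and support vector regression is just one of the regression frameworks such experiments compare.

        # Sketch: frame-by-frame conflict intensity estimation from fused
        # (synthetic) audio and visual descriptors.
        import numpy as np
        from sklearn.svm import SVR
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n_frames = 1000
        audio = rng.normal(size=(n_frames, 20))     # e.g., prosodic features
        visual = rng.normal(size=(n_frames, 30))    # e.g., facial descriptors
        fused = np.hstack([audio, visual])          # feature-level fusion
        intensity = rng.uniform(-1, 1, n_frames)    # real-valued annotation

        # Note: a random split is used here for brevity; subject- and
        # session-independent protocols would split by episode instead.
        X_tr, X_te, y_tr, y_te = train_test_split(fused, intensity,
                                                  test_size=0.2, random_state=0)
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
        model.fit(X_tr, y_tr)
        print(model.predict(X_te[:5]))              # frame-wise intensities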

    Stressful first impressions in job interviews

    Stress can impact many aspects of our lives, such as the way we interact and work with others, or the first impressions that we make. In the past, stress has most commonly been assessed through self-reported questionnaires; however, advances in wearable technology have enabled the unobtrusive measurement of physiological symptoms of stress. Using a dataset of job interviews, we investigate whether first impressions of stress (from annotations) are equivalent to physiological measurements of electrodermal activity (EDA). We examine the use of automatically extracted nonverbal cues stemming from both the visual and audio modalities, as well as EDA stress measurements, for the inference of stress impressions obtained from manual annotations. Stress impressions were found to be significantly negatively correlated with hireability ratings, i.e., individuals who were perceived to be more stressed were more likely to obtain lower hireability scores. The analysis revealed a significant relationship between stress impressions and the audio and visual features, albeit with low predictability, while no significant effects were found for the EDA features. While some nonverbal cues were clearly related to stress, the physiological cues were less reliable and warrant further investigation into the use of wearable sensors for stress detection.
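    The reported relationships reduce to simple correlation tests between per-interview scores. The sketch below shows the form of such an analysis: a significant negative stress-hireability correlation alongside a non-significant EDA effect. All arrays and effect sizes are fabricated for illustration.

        # Sketch: correlating annotated stress impressions with hireability
        # ratings and an EDA summary feature across interviews (toy data).
        import numpy as np
        from scipy.stats import pearsonr

        rng = np.random.default_rng(1)
        n = 60                                      # number of interviews
        stress = rng.uniform(1, 5, n)               # annotated stress impression
        hireability = 5 - 0.6 * stress + rng.normal(0, 0.5, n)
        eda_mean = rng.normal(size=n)               # unrelated EDA feature

        for name, scores in [("hireability", hireability), ("EDA", eda_mean)]:
            r, p = pearsonr(stress, scores)
            # A negative r with small p mirrors the stress-hireability link;
            # a near-zero r with large p mirrors the null EDA result.
            print(f"stress vs {name}: r={r:.2f}, p={p:.3f}")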
