
    Multimodal analysis of verbal and nonverbal behaviour on the example of clinical depression

    Clinical depression is a common mood disorder that may last for long periods, vary in severity, and impair an individual’s ability to cope with daily life. Depression affects 350 million people worldwide and is therefore considered a burden not only on a personal and social level, but also on an economic one. Depression is the fourth most significant cause of suffering and disability worldwide, and it was predicted to become the leading cause by 2020. Although treatment of depression disorders has proven effective in most cases, misdiagnosing depressed patients is a common barrier, not only because depression manifests itself in different ways, but also because clinical interviews and self-reported history are currently the only means of diagnosis, which risks a range of subjective biases from either the patient’s report or the clinical judgment. While automatic affective state recognition has become an active research area in the past decade, methods for detecting mood disorders such as depression are still in their infancy. Using the advancements of affective sensing techniques, the long-term goal is to develop an objective multimodal system that supports clinicians during the diagnosis and monitoring of clinical depression. This dissertation aims to investigate the most promising characteristics of depression that can be “heard” and “seen” by a computer system for the task of detecting depression objectively. Using audio-video recordings of a clinically validated Australian depression dataset, several experiments are conducted to characterise depression-related patterns from verbal and nonverbal cues. Of particular interest in this dissertation is the exploration of speech style, speech prosody, eye activity, and head pose modalities. Statistical analysis and automatic classification of the extracted cues are investigated.
In addition, multimodal fusion methods for these modalities are examined to increase the accuracy and confidence of depression detection. These investigations result in a proposed system that detects depression in a binary manner (i.e. depressed vs. non-depressed) using temporal depression behavioural cues. The proposed system: (1) uses audio-video recordings to investigate verbal and nonverbal modalities, (2) extracts functional features from the verbal and nonverbal modalities over each subject’s entire segments, (3) pre- and post-normalises the extracted features, (4) selects features using the T-test, (5) classifies depression in a binary manner (i.e. severely depressed vs. healthy controls), and finally (6) fuses the individual modalities. The proposed system was validated for scalability and usability using generalisation experiments. Close studies were made of American and German depression datasets individually, and then in combination with the Australian one. Applying the proposed system to the three datasets showed remarkably high classification results: up to a 95% average recall for the individual sets and 86% for the three combined. A strong implication is that the proposed system is able to generalise to different datasets recorded under quite different conditions, such as collection procedure and task, depression diagnosis test and scale, as well as cultural and language background. High performance was found consistently in speech prosody and eye activity in both the individual and combined datasets, with head pose features somewhat less discriminative. This strongly indicates that the extracted features are robust to large variations in recording conditions. Furthermore, once the modalities were combined, the classification results improved substantially.
Therefore, the modalities are shown both to correlate with and complement each other, working in tandem as an innovative system for the diagnosis of depression across large variations of population and procedure.
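Steps (3)–(5) of the pipeline above can be sketched in a few lines; this is a minimal illustration assuming scikit-learn and SciPy, with synthetic features standing in for the dissertation's actual verbal/nonverbal functionals and with illustrative sizes and thresholds:

```python
# Sketch of: normalise features, select via per-feature t-test,
# classify binary (depressed vs. control). Synthetic data only.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_per_class, n_feats = 30, 50
# Two groups; only the first 5 features carry a real group difference.
X_dep = rng.normal(0.8, 1.0, (n_per_class, n_feats))
X_dep[:, 5:] = rng.normal(0.0, 1.0, (n_per_class, n_feats - 5))
X_ctl = rng.normal(0.0, 1.0, (n_per_class, n_feats))
X = np.vstack([X_dep, X_ctl])
y = np.array([1] * n_per_class + [0] * n_per_class)

X = StandardScaler().fit_transform(X)            # normalisation step
_, p = ttest_ind(X[y == 1], X[y == 0], axis=0)   # per-feature t-test
selected = p < 0.05                              # keep significant features

clf = SVC(kernel="linear").fit(X[:, selected], y)
acc = clf.score(X[:, selected], y)
print(int(selected.sum()), round(acc, 2))
```

In practice the selection threshold and classifier would be tuned inside cross-validation rather than on the full set, to avoid the optimistic bias of selecting features on the test subjects.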

    Multimodal region-based behavioral modeling for suicide risk screening

    Introduction: Suicide is a leading cause of death around the world, inflicting immense suffering on the families and communities of the individuals. Such pain and suffering are preventable with early screening and monitoring. However, current suicide risk identification relies on self-disclosure and/or the clinician's judgment. Research question/statement: We therefore investigate acoustic and nonverbal behavioral markers associated with different levels of suicide risk through a multimodal approach to suicide risk detection. Given the differences in behavioral dynamics between subregions of facial expressions and body gestures in terms of timespans, we propose a novel region-based multimodal fusion. Methods: We used a newly collected video interview dataset of young Japanese people at risk of suicide to extract engineered features and deep representations from the speech, regions of the face (i.e., eyes, nose, mouth), regions of the body (i.e., shoulders, arms, legs), as well as the overall combined regions of face and body. Results: The results confirmed that behavioral dynamics differ between regions: some regions benefit from shorter timespans, while others benefit from longer ones. A region-based multimodal approach is therefore more informative in terms of behavioral markers and accounts for both subtle and strong behaviors. Our region-based multimodal results outperformed the single modalities, reaching a sample-level accuracy of 96% compared with 80% for the best single modality. Interpretation of the behavioral markers showed that the higher the suicide risk level, the lower the expressivity, movement, and energy observed from the subject.
Moreover, the high-risk group expressed more disgust and contact avoidance, while the low-risk group expressed self-soothing and anxiety behaviors. Discussion: Even though multimodal analysis is a powerful tool to enhance model performance and reliability, it is important to ensure through careful selection that a strong behavioral modality (e.g., body movement) does not dominate a more subtle one (e.g., eye blink). Despite the small sample size, our unique dataset and the current results add a new cultural dimension to research on nonverbal markers of suicide risk. Given a larger dataset, future work on this method could help psychiatrists with the assessment of suicide risk and could have several applications to identify those at risk.
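One way to make the caution about modality dominance concrete is an explicit late-fusion step: each region contributes its own risk probability, and a weighted mean keeps any single strong modality from overwhelming a subtle one. The region names and weights below are illustrative, not the paper's actual fusion scheme:

```python
# Hypothetical late fusion of per-region risk scores. Each region
# (speech, eyes, mouth, arms, ...) yields its own probability of
# high risk, possibly computed over a different timespan; fusion
# takes a weighted mean so no one modality dominates by default.
def fuse_regions(region_probs, weights=None):
    """Weighted mean of per-region probabilities (equal weights by default)."""
    if weights is None:
        weights = {r: 1.0 for r in region_probs}
    total = sum(weights[r] for r in region_probs)
    return sum(region_probs[r] * weights[r] for r in region_probs) / total

probs = {"speech": 0.9, "eyes": 0.4, "mouth": 0.6, "arms": 0.7}
print(round(fuse_regions(probs), 3))  # equal-weight mean -> 0.65
```

Unequal weights (e.g. down-weighting body movement) could then be chosen on held-out data, which is one simple realisation of the "careful selection" the discussion calls for.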

    Cross cultural detection of depression from nonverbal behaviour

    Millions of people worldwide suffer from depression. Do commonalities exist in their nonverbal behavior that would enable cross-culturally viable screening and assessment of severity? We investigated the generalisability of an approach to detect depression severity cross-culturally using video-recorded clinical interviews from Australia, the USA, and Germany. The material varied in type of interview, subtypes of depression and inclusion of healthy control subjects, cultural background, and recording environment. The analysis focussed on temporal features of participants' eye gaze and head pose. Several approaches to training and testing within and between datasets were evaluated. The strongest results were found for training across all datasets and testing across datasets using leave-one-subject-out cross-validation. In contrast, generalisability was attenuated when training on only one or two of the three datasets and testing on subjects from the dataset(s) not used in training. These findings highlight the importance of using training data exhibiting the expected range of variability.
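The leave-one-subject-out protocol used here can be sketched with scikit-learn's LeaveOneGroupOut, treating subject IDs as groups so that no subject's recordings appear in both training and test folds. Data, model, and sizes below are synthetic and illustrative only:

```python
# Leave-one-subject-out evaluation on pooled data: each fold holds
# out every segment of one subject. Synthetic features stand in for
# the real gaze/head-pose functionals.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(1)
n_subjects, per_subj = 12, 4
X = rng.normal(size=(n_subjects * per_subj, 6))
y = np.repeat(np.tile([0, 1], n_subjects // 2), per_subj)  # per-subject labels
X[:, 0] += y * 1.5        # make the label weakly recoverable from one feature
subjects = np.repeat(np.arange(n_subjects), per_subj)

scores = cross_val_score(LogisticRegression(), X, y,
                         groups=subjects, cv=LeaveOneGroupOut())
print(len(scores), round(scores.mean(), 2))  # one fold per subject
```

Grouping by subject is what makes the estimate honest here: splitting segments randomly would let the model memorise subject identity rather than depression-related behaviour.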

    A Robotic Positive Psychology Coach to Improve College Students' Wellbeing

    A significant number of college students suffer from mental health issues that impact their physical, social, and occupational outcomes. Various scalable technologies have been proposed in order to mitigate the negative impact of mental health disorders. However, the evaluation of these technologies, if done at all, often reports mixed results on improving users' mental health. We need to better understand the factors that align a user's attributes and needs with technology-based interventions for positive outcomes. In psychotherapy theory, therapeutic alliance and rapport between a therapist and a client is regarded as the basis for therapeutic success. In prior work, social robots have shown the potential to build rapport and a working alliance with users in various settings. In this work, we explore the use of a social robot coach to deliver positive psychology interventions to college students living in on-campus dormitories. We recruited 35 college students to participate in our study and deployed a social robot coach in their rooms. The robot delivered daily positive psychology sessions alongside other useful skills such as delivering the weather forecast and scheduling reminders. We found a statistically significant improvement in participants' psychological wellbeing, mood, and readiness to change behavior for improved wellbeing after they completed the study. Furthermore, students' personality traits were found to have a significant association with intervention efficacy. Analysis of the post-study interviews revealed students' appreciation of the robot's companionship and their concerns about privacy. Comment: 8 pages, 5 figures, RO-MAN 2020, Best paper award

    Objective methods for reliable detection of concealed depression

    Recent research has shown that it is possible to automatically detect clinical depression from audio-visual recordings. Before considering integration in a clinical pathway, a key question that must be asked is whether such systems can be easily fooled. This work explores the potential of acoustic features to detect clinical depression in adults both when acting normally and when asked to conceal their depression. Nine adults diagnosed with mild to moderate depression as per the Beck Depression Inventory (BDI-II) and Patient Health Questionnaire (PHQ-9) were asked a series of questions and to read an excerpt from a novel aloud under two different experimental conditions. In one, participants were asked to act naturally and in the other, to suppress anything that they felt would be indicative of their depression. Acoustic features were then extracted from this data and analysed using paired t-tests to determine any statistically significant differences between healthy and depressed participants. Most features that were found to be significantly different during normal behaviour remained so during concealed behaviour. In leave-one-subject-out automatic classification studies of the 9 depressed subjects and 8 matched healthy controls, an 88% classification accuracy and 89% sensitivity were achieved. Results remained relatively robust during concealed behaviour, with classifiers trained on only non-concealed data achieving 81% detection accuracy and 75% sensitivity when tested on concealed data. These results indicate there is good potential to build deception-proof automatic depression monitoring systems.
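The paired-test idea can be illustrated with SciPy's ttest_rel: measure the same subjects under two conditions and test whether the per-subject differences are systematically non-zero. The feature (mean pitch), the numbers, and the effect size below are synthetic assumptions, not the study's data:

```python
# Illustrative paired t-test: one hypothetical acoustic feature
# (mean F0 in Hz) for the same nine subjects under natural vs.
# concealed conditions. Pairing controls for per-subject offsets.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(2)
natural = rng.normal(180.0, 20.0, size=9)       # 9 subjects, natural condition
concealed = natural + rng.normal(8.0, 3.0, 9)   # systematic shift when concealing

t_stat, p_value = ttest_rel(natural, concealed)
print(p_value < 0.05)  # the simulated shift is detectable
```

Because the shift is applied within subject, the paired test detects it even though the between-subject spread (20 Hz) is much larger than the shift itself; an unpaired test on the same numbers would be far less sensitive.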

    From joyous to clinically depressed: Mood detection using multimodal analysis of a person's appearance and speech

    Clinical depression is a critical public health problem, with high costs associated to a person's functioning, mortality, and social relationships, as well as the economy overall. Currently, there is no dedicated objective method to diagnose depression.

    The Effectiveness of Depth Data in Liveness Face Authentication Using 3D Sensor Cameras

    Even though biometric technology increases the security of the systems that use it, these systems are prone to spoofing attacks in which fraudulent biometrics are presented. To overcome these risks, techniques for detecting the liveness of the biometric measure are employed. For example, in systems that use face authentication as the biometric, liveness is assured using an estimation of blood flow or an analysis of the quality of the face image. Liveness assurance of the face using real depth techniques is rarely used in biometric devices and in the literature, even with the availability of depth datasets. This technique of employing 3D cameras for liveness in face authentication is therefore underexplored with respect to its vulnerabilities to spoofing attacks. This research reviews the literature on this aspect and then evaluates liveness detection to suggest solutions that account for the weaknesses found in detecting spoofing attacks. We conduct a proof-of-concept study to assess the liveness detection of 3D cameras in three devices; the results show that greater flexibility led to a higher rate of detecting spoofing attacks. Nonetheless, selecting a wide depth range for the 3D camera was found to be important for anti-spoofing security recognition systems such as surveillance cameras used in airports. Therefore, to utilise the depth information and implement techniques that detect faces regardless of distance, a 3D camera with a long maximum depth range (e.g., 20 m) and high-resolution stereo cameras could be selected, which can have a positive impact on accuracy.
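One way to read the depth argument: a printed photo or screen replay is nearly planar, so the depth relief across the detected face region is itself a simple liveness cue. A toy sketch under that assumption, with synthetic depth maps standing in for real 3D-camera output and an illustrative threshold:

```python
# Toy depth-based liveness check: a real face has centimetre-scale
# depth relief (nose vs. cheeks), while a photo or screen spoof is
# nearly planar. Depth values are in metres; data is synthetic.
import numpy as np

def is_live(face_depth, min_relief_m=0.01):
    """Flag the face as live if its depth relief exceeds a threshold."""
    valid = face_depth[face_depth > 0]          # drop missing-depth pixels
    return float(valid.max() - valid.min()) > min_relief_m

real = 0.60 + 0.03 * np.random.default_rng(3).random((64, 64))  # ~3 cm relief
flat = np.full((64, 64), 0.60)                                   # planar spoof
print(is_live(real), is_live(flat))
```

A curved-screen or mask attack would defeat this single cue, which is one reason the paper evaluates devices rather than relying on any one heuristic; depth range and resolution determine how reliably such relief can be measured at distance.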

    An exploratory study of detecting emotion states using eye-tracking technology

    Studying eye movement has proven to be useful in the study of detecting and understanding human emotional states. This paper aims to investigate eye movement features: pupil size, time of first fixation, first fixation duration, fixation duration and fix

    Conceptualization and development of an autonomous and personalized early literacy content and robot tutor behavior for preschool children

    Personalized learning has a higher impact on students’ progress than traditional approaches. However, the resources currently required to implement personalization are scarce. This research aims to conceptualize and develop an autonomous robot tutor with a personalization policy for preschool children aged three to five years. Personalization is performed by automatically adjusting the difficulty level of the lesson delivery and assessment, as well as adjusting the feedback based on the reaction of the children. This study explores three child behaviors for the personalization policy: (i) academic knowledge (measured by the correctness of the answer), (ii) executive functioning of attention (measured by the orientation of the child’s body and gaze direction), and (iii) working memory or hesitation (measured by the time lag before the answer). Moreover, this study designed lesson content through interviews with teachers and deployed the personalization interaction policy through the NAO robot with five children in a case-study design. We qualitatively analyze the session observations and parent interviews, and quantitatively analyze knowledge gain through pre- and posttests and a parent questionnaire. The findings of the study reveal that personalized interaction with the robot showed positive potential in increasing the children’s learning gains and attracting their engagement. As general guidelines based on this pilot study, we identified additional personalization strategies that could be used for autonomous personalization policies based on each child’s behavior, which could have a considerable impact on child learning.
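The three observed signals (correctness, attention, hesitation) suggest a simple rule-based adjustment policy. A hypothetical sketch follows; the thresholds, level range, and rules are illustrative assumptions, not the study's actual policy:

```python
# Hypothetical difficulty-adjustment policy driven by the three
# signals named above: answer correctness, attention (gaze on task),
# and hesitation (time lag before answering). Thresholds illustrative.
def next_difficulty(level, correct, attentive, response_lag_s):
    """Return the difficulty level (1-5) for the next lesson item."""
    if correct and attentive and response_lag_s < 5.0:
        level += 1          # confident, engaged, fast: step up
    elif not correct or response_lag_s > 10.0:
        level -= 1          # wrong or very hesitant: step down
    # inattentive but correct: hold the level and re-engage instead
    return max(1, min(5, level))

print(next_difficulty(3, True, True, 2.0))   # steps up to 4
print(next_difficulty(1, False, True, 3.0))  # clamped at the minimum, 1
```

A deployed policy would also smooth over several turns rather than reacting to a single answer, since one hesitation may reflect distraction rather than difficulty.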