418 research outputs found

    Automatic analysis of speech F0 contour for the characterization of mood changes in bipolar patients

    Bipolar disorders are characterized by mood swings ranging from mania to depression. A system that could monitor and eventually predict these changes would be useful to improve therapy and avoid dangerous events. Speech may convey relevant information about a subject's mood, and there is growing interest in studying its changes in the presence of mood disorders. In this work we present an automatic method to characterize fundamental frequency (F0) dynamics in the voiced parts of syllables. The method segments voiced sounds from running speech samples and estimates two categories of features. The first category is borrowed from Taylor's Tilt intonational model; however, the meaning of the proposed features differs from that of Taylor's, since the former are estimated from all voiced segments without performing any analysis of intonation. The second category of features takes into account the speed of change of F0. The proposed features are first estimated from an emotional speech database. Then, an analysis of speech samples acquired from eleven psychiatric patients experiencing different mood states, and from eighteen healthy control subjects, is introduced. Subjects had to perform a text reading task and a picture commenting task. The results on the emotional speech database indicate that the proposed features can discriminate between high- and low-arousal emotions; this was verified both at the single-subject and at the group level. An intra-subject analysis performed on bipolar patients highlighted significant changes of the features across mood states, although this was not observed for all the subjects. The directions of the changes estimated for different patients experiencing the same mood swing were not coherent and were task-dependent.
Interestingly, a single-subject analysis performed on healthy controls, and on bipolar patients recorded twice with the same mood label, yielded a very small number of significant differences. In particular, very good specificity was highlighted for the Taylor-inspired features and for a subset of the second category of features, thus strengthening the significance of the results obtained with patients. Even if the number of enrolled patients is small, this work suggests that the proposed features might give a relevant contribution to the demanding research field of speech-based mood classifiers. Moreover, the results presented here indicate that a model of speech changes in bipolar patients might be subject-specific, and that a richer characterization of subject status could be necessary to explain the observed variability.
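
The abstract does not give the exact formulation of the Taylor-inspired features; the sketch below is a hypothetical illustration of Tilt-style amplitude and duration parameters computed from one voiced segment's frame-wise F0 contour (it assumes an interior F0 peak with nonzero rise and fall, which real contours do not always satisfy).

```python
def tilt_features(f0):
    """Tilt-style parameters for one voiced segment's F0 contour (Hz).

    Splits the contour at its F0 peak into a rise and a fall, then
    computes amplitude and duration ratios in [-1, 1], in the spirit of
    Taylor's Tilt model (here applied to every voiced segment, with no
    intonational analysis).
    """
    peak = max(range(len(f0)), key=lambda i: f0[i])
    a_rise = f0[peak] - f0[0]        # rise amplitude (Hz)
    a_fall = f0[peak] - f0[-1]       # fall amplitude (Hz)
    d_rise = peak                    # rise duration (frames)
    d_fall = len(f0) - 1 - peak      # fall duration (frames)
    tilt_amp = (abs(a_rise) - abs(a_fall)) / (abs(a_rise) + abs(a_fall))
    tilt_dur = (d_rise - d_fall) / (d_rise + d_fall)
    return tilt_amp, tilt_dur, (tilt_amp + tilt_dur) / 2
```

A value near +1 indicates a rise-dominated segment, near -1 a fall-dominated one.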

    Objective methods for reliable detection of concealed depression

    Recent research has shown that it is possible to automatically detect clinical depression from audio-visual recordings. Before considering integration in a clinical pathway, a key question that must be asked is whether such systems can be easily fooled. This work explores the potential of acoustic features to detect clinical depression in adults both when acting normally and when asked to conceal their depression. Nine adults diagnosed with mild to moderate depression, as per the Beck Depression Inventory (BDI-II) and Patient Health Questionnaire (PHQ-9), were asked a series of questions and to read an excerpt from a novel aloud under two different experimental conditions: in one, participants were asked to act naturally, and in the other, to suppress anything that they felt would be indicative of their depression. Acoustic features were then extracted from these data and analysed using paired t-tests to determine any statistically significant differences between healthy and depressed participants. Most features that were found to be significantly different during normal behaviour remained so during concealed behaviour. In leave-one-subject-out automatic classification studies of the 9 depressed subjects and 8 matched healthy controls, 88% classification accuracy and 89% sensitivity were achieved. Results remained relatively robust during concealed behaviour, with classifiers trained only on non-concealed data achieving 81% detection accuracy and 75% sensitivity when tested on concealed data. These results indicate there is good potential to build deception-proof automatic depression monitoring systems.
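
As a concrete illustration of the statistical test named above, this is a minimal paired t-test statistic for one acoustic feature measured on the same speakers under the two conditions; the study's actual feature set and analysis software are not specified in the abstract, so the function is a generic sketch.

```python
import math

def paired_t_statistic(cond_a, cond_b):
    """t statistic for a paired t-test between two conditions measured
    on the same speakers (e.g. one feature under normal vs. concealed
    behaviour). Compare against a t distribution with n - 1 degrees
    of freedom to obtain a p-value."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)
```

In practice a library routine such as `scipy.stats.ttest_rel` would be used instead, which also returns the p-value.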

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the keenly felt need to share know-how, objectives and results between areas that had until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial themes have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The MAVEBA Workshop proceedings, published on a biennial basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and the classification of vocal pathologies. The Workshop is sponsored by: Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control journal (Elsevier), and the IEEE Biomedical Engineering Society. Special issues of international journals have been, and will be, published collecting selected papers from the conference.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The MAVEBA Workshop proceedings, published on a biennial basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and the classification of vocal pathologies.

    Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

    Recently, researchers have paid increasing attention to studying the emotional state of an individual from his/her speech signals, as speech is the fastest and most natural method of communication between individuals. In this work, a new feature enhancement method using a Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were utilized to evaluate the proposed methods. Extreme learning machine (ELM) and k-nearest neighbor (kNN) classifiers were employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed methods significantly improved speech emotion recognition performance compared to works published in the literature.
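
The abstract does not detail the paper's GMM-based enhancement; the sketch below illustrates one common form such a transform can take, mapping a raw scalar feature into the posterior ("responsibility") probabilities of a fitted mixture's components, a soft-assignment representation that a classifier such as ELM or kNN can then consume. The component parameters here are illustrative, not from the paper.

```python
import math

def gmm_posteriors(x, weights, means, variances):
    """Map a scalar feature value x to the vector of posterior
    probabilities (responsibilities) of each Gaussian component.
    The mixture parameters would normally come from fitting a GMM
    to training data (e.g. via expectation-maximization)."""
    def pdf(v, m, s2):
        return math.exp(-(v - m) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    likes = [w * pdf(x, m, s2) for w, m, s2 in zip(weights, means, variances)]
    total = sum(likes)
    return [l / total for l in likes]
```

With library support, `sklearn.mixture.GaussianMixture` provides the fitting step and a `predict_proba` method that computes the same responsibilities.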

    The Impact of Emotion Focused Features on SVM and MLR Models for Depression Detection

    Major depressive disorder (MDD) is a common mental health diagnosis, with an estimated 25% or more of the United States population remaining undiagnosed. Psychomotor symptoms of MDD affect the speed of control of the vocal tract, glottal source features and the rhythm of speech. Speech enables people to perceive the emotion of the speaker, and MDD decreases the magnitude of the moods expressed by an individual. This study asks the question: if high-level features designed to combine acoustic features related to emotion detection are added to glottal source features and mean response time in support vector machine and multivariate logistic regression models, does this improve the recall of the MDD class? To answer this question, a literature review covers common features in MDD detection, especially features related to emotion recognition. Using feature transformation, emotion recognition composite features are produced and added to glottal source features for model evaluation.
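
Since the study's success criterion is the recall of the MDD class, a minimal sketch of that metric follows; the class labels are illustrative placeholders, not the study's encoding.

```python
def recall(y_true, y_pred, positive="MDD"):
    """Recall of the positive (MDD) class: of all truly depressed
    subjects, the fraction the model correctly flags as depressed.
    This is the quantity the emotion-focused features are meant
    to improve."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)
```

Recall is the natural target here because a missed depressed subject (false negative) is costlier than a false alarm.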

    Detection of clinical depression in adolescents using acoustic speech analysis

    Clinical depression is a major risk factor in suicides and is associated with high mortality rates, making it one of the leading causes of death worldwide every year. Symptoms of depression often first appear during adolescence, at a time when the voice is changing in both males and females, suggesting that specific studies of these phenomena in adolescent populations are warranted. The properties of acoustic speech have previously been investigated as possible cues for depression in adults. However, these studies were restricted to small populations of patients, and the speech recordings were made during patients' clinical interviews or fixed-text reading sessions. A collaborative effort with the Oregon Research Institute (ORI), USA allowed the development of a new speech corpus with a large sample size of 139 adolescents (46 males and 93 females), divided into two groups (68 clinically depressed and 71 controls). The speech recordings were made during naturalistic interactions between adolescents and parents. Instead of covering a plethora of acoustic features, this study takes the knowledge base from speech science and groups the acoustic features into five categories that relate to the physiological and perceptual areas of the speech production mechanism: prosodic, cepstral, spectral, glottal and Teager energy operator (TEO) based features. The effectiveness of applying these acoustic feature categories to detecting adolescent depression was measured, and the salient feature categories were determined by testing the categories and their combinations within a binary classification framework.
Consistent with previous studies, it was observed that:
- there are strong gender-related differences in classification accuracy;
- the glottal features provide an important enhancement of the classification accuracy when combined with other types of features.
An important new contribution of this thesis was the observation that the TEO-based features significantly outperformed the prosodic, cepstral, spectral and glottal features and their combinations. An investigation into the possible reasons for the strong performance of the TEO features pointed to the importance of nonlinear mechanisms associated with glottal flow formation as possible cues for depression.
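
The TEO itself has a standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]; a minimal sketch of it applied to a sample sequence:

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator:
        psi[n] = x[n]^2 - x[n-1] * x[n+1]
    For a pure tone A*cos(w*n) this equals A^2 * sin(w)^2, so it tracks
    amplitude and frequency jointly -- the nonlinear 'energy' underlying
    TEO-based speech features. Returns len(x) - 2 values (the operator
    needs one sample of context on each side)."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
```

On real speech the operator is typically applied per frame, often after band-pass filtering, and summary statistics of psi form the features.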