774 research outputs found

    Analysis of atypical prosodic patterns in the speech of people with Down syndrome

    Get PDF
    Producción Científica

    The speech of people with Down syndrome (DS) shows prosodic features which are distinct from those observed in the oral productions of typically developing (TD) speakers. Although a different prosodic realization does not necessarily imply incorrect expression of prosodic functions, atypical expression may hinder communication skills. The focus of this work is to ascertain whether this can be the case in individuals with DS. To do so, we analyze the acoustic features that best characterize the utterances of speakers with DS when expressing prosodic functions related to emotion, turn-end, and phrasal chunking, comparing them with those used by TD speakers. An oral corpus of speech utterances was recorded using the PEPS-C prosodic competence evaluation tool. We use automatic classifiers to show that the prosodic features that best predict prosodic functions in TD speakers are less informative in speakers with DS. Although atypical features are observed in speakers with DS when producing prosodic functions, the intended prosodic function can be identified by listeners and, in most cases, the features correctly discriminate the function with analytical methods. However, a greater difference between the minimal pairs presented in the PEPS-C test is found for TD speakers in comparison with DS speakers. The proposed methodological approach provides, on the one hand, an identification of the set of features that distinguish the prosodic productions of DS and TD speakers and, on the other, a set of target features for therapy with speakers with DS. Funded by Ministerio de Economía, Industria y Competitividad - Fondo Europeo de Desarrollo Regional (grant TIN2017-88858-C2-1-R) and Junta de Castilla y León (grant VA050G18).
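The comparison the abstract describes, training classifiers per speaker group and checking how informative each acoustic feature is, can be sketched as follows. This is a hypothetical illustration on synthetic data, not the authors' code; the feature names and the separation values are assumptions.

```python
# Sketch: predict a prosodic function (turn-end vs. non-final) from acoustic
# features, then compare accuracy and feature importances across two groups.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def make_group(n, f0_separation):
    """Synthetic per-utterance features: [final F0 slope, final-syllable
    duration, intensity drop]; f0_separation controls how well F0 slope
    separates the two prosodic functions."""
    y = rng.integers(0, 2, n)                 # 0 = non-final, 1 = turn-end
    f0_slope = y * f0_separation + rng.normal(0, 1, n)
    duration = y * 0.5 + rng.normal(0, 1, n)
    intensity = rng.normal(0, 1, n)           # uninformative by construction
    return np.column_stack([f0_slope, duration, intensity]), y

results = {}
# TD speakers: F0 slope strongly separates the classes; DS speakers: weakly.
for name, sep in [("TD", 2.0), ("DS", 0.5)]:
    X, y = make_group(400, sep)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    results[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(name, "accuracy %.2f" % results[name],
          "importances", clf.feature_importances_.round(2))
```

The same feature set yields lower accuracy for the group in which the feature is less discriminative, which mirrors the paper's finding that TD-informative features are less informative for DS speech.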

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The proceedings of the MAVEBA Workshop, held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are: the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images as a support to the clinical diagnosis and classification of vocal pathologies.

    Genomics Approaches for Studying Musical Aptitude and Related Traits

    Get PDF
    Print Publication Date: Jul 2019. Peer reviewed.

    Big Data analytics to assess personality based on voice analysis

    Full text link
    Bachelor's thesis (Trabajo Fin de Grado) in Ingeniería de Tecnologías y Servicios de Telecomunicación.

    When humans speak, the series of acoustic signs they produce encodes not only the linguistic message they wish to communicate but also several other kinds of information about themselves and their states, which offer glimpses of their personalities and can be picked up by judges. Since filming job candidates' interviews is now a common practice, the aim of this thesis is to explore possible correlations between speech features extracted from interviews and personality characteristics established by experts, and to try to predict a candidate's Big Five personality traits: Conscientiousness, Agreeableness, Neuroticism, Openness to Experience and Extraversion. The features were extracted from an original database of 44 video recordings of women acquired in 2020, plus 78 acquired in 2019 and earlier for a previous study. Although many significant correlations were found in each year's dataset, many of them proved inconsistent across the two studies. Only Extraversion, and to a more limited extent Openness, showed a substantial number of clear correlations. In essence, Extraversion was found to be related to variation in the slope of the pitch (usually at the end of sentences), suggesting that a more "singing" voice could be associated with a higher score. Spectral entropy and roll-off measurements likewise indicated that larger changes in the spectrum (which may also correspond to more "singing" voices) could be associated with greater Extraversion. As for the predictive modelling algorithms intended to estimate personality traits from the speech features, results were very limited in terms of accuracy and RMSE, as confirmed by scatter plots for the regression models and confusion matrices for the classification evaluation.
Nevertheless, several results suggest some predictive capability, and Extraversion and Openness again emerged as the most predictable traits. Better outcomes were achieved when predictions were based on one specific feature rather than on all of them or on a reduced group: this was the case for Openness, when estimated through linear and logistic regression on the time spent above 90% of the variation range of the deltas of the spectral-entropy modulus, and for Extraversion, which correlates well with features capturing variation in the decreasing F0 slope and variations in the spectrum. Several machine learning algorithms were used for the predictions, including linear regression, logistic regression and random forests.
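The single-feature-versus-all-features comparison described above can be illustrated with a minimal regression sketch. All data here is synthetic and the feature names are invented stand-ins, not the thesis's actual feature set.

```python
# Sketch: predict an extraversion score from one informative prosodic feature
# vs. that feature plus several noise features, comparing test RMSE.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 122  # roughly the 44 + 78 recordings described above

# Synthetic data: F0 end-slope variation correlates with extraversion;
# the remaining columns are noise (standing in for inconsistent features).
f0_slope_var = rng.normal(0, 1, n)
noise = rng.normal(0, 1, (n, 8))
extraversion = 3 + 0.8 * f0_slope_var + rng.normal(0, 0.5, n)

X_one = f0_slope_var.reshape(-1, 1)
X_all = np.column_stack([f0_slope_var, noise])

rmses = {}
for label, X in [("single feature", X_one), ("all features", X_all)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, extraversion, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    rmses[label] = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(label, "RMSE %.2f" % rmses[label])
```

With so few recordings, extra uninformative features mostly add variance, which is one plausible reason single-feature models fared better in the thesis; random forests and logistic regression would slot into the same loop.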

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a strongly felt need to share know-how, objectives and results between areas that had until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread into other fields of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.

    The Role of Compositional Music Therapy in the Treatment of Adults with Bipolar Disorder

    Get PDF
    The purpose of this thesis is to expand our knowledge and understanding of Bipolar Disorder and its relationship with compositional music therapy as a possible beneficial treatment for the illness. Compositional music therapy is both a music therapy intervention and a process in which the client and therapist work together to generate an original, permanent musical product. The music may be instrumental or vocal, of any genre, and may be notated as a score, handwritten or typed, or recorded (CD, tape, MP3, etc.). It may incorporate the client's original song/rap, lyrics, or poetry (set to music), or be instrumental only. This leads to the research question: what are the benefits of compositional music therapy for clients diagnosed with Bipolar Disorder? The question is examined through three subordinate questions: 1. What are the bases and/or rationales for selecting music composition as a method to address client goals for adult patients diagnosed with Bipolar Disorder? 2. How is the compositional music therapy process useful and/or helpful in addressing the clinical goals of clients diagnosed with Bipolar Disorder? 3. How is the compositional music therapy product useful and/or helpful in addressing those clinical goals? The study seeks to answer both the research question and the subordinate questions by consulting music therapists who have worked directly with adult patients diagnosed with Bipolar Disorder. This survey research examines the benefits of compositional music therapy as distinct from other forms of music therapy, as well as what is important about the compositional product and process for adult clients diagnosed with Bipolar Disorder.

    Voice-based analysis of major depressive disorder: focusing on acoustic changes in continuous utterances

    Get PDF
    Ph.D. dissertation, Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Digital Information Convergence), February 2023. Advisor: Kyogu Lee.

    Major depressive disorder (commonly referred to as depression) is a common disorder that affects 3.8% of the world's population. Depression stems from various causes, such as genetics, aging, social factors, and abnormalities in the neurotransmitter system; thus, early detection and monitoring are essential. The human voice is considered a representative biomarker for observing depression, and accordingly several studies have developed automatic depression diagnosis systems based on speech. However, constructing a speech corpus is challenging, most studies focus on adults under 60 years of age, and medical hypotheses grounded in the clinical findings of psychiatrists remain insufficient, which limits the evolution of such systems into medical diagnostic tools. Moreover, the effect of taking antipsychotic drugs on speech characteristics during the treatment phase is overlooked. This thesis therefore studies a speech-based automatic depression diagnosis system at the semantic level (sentence). First, to analyze depression among the elderly, whose emotional changes are not adequately reflected in speech characteristics, it develops mood-inducing sentences to build an elderly depression speech corpus and designs an automatic depression diagnosis system for the elderly. Second, it constructs an extrapyramidal symptom speech corpus to investigate extrapyramidal symptoms, a typical side effect of antipsychotic drug overdose, and finds a strong correlation between antipsychotic dose and speech characteristics. The study paves the way for a comprehensive examination of automatic diagnosis systems for depression.

    Contents: Chapter 1, Introduction; Chapter 2, Theoretical Background; Chapter 3, Developing Sentences for a New Depressed Speech Corpus; Chapter 4, Screening Depression in the Elderly; Chapter 5, Correlation Analysis of Antipsychotic Dose and Speech Characteristics; Chapter 6, Conclusions and Future Work.
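The dose-versus-speech-characteristics analysis in Chapter 5 is, in outline, a correlation study. A minimal sketch on synthetic data follows; the feature names, dose range, and effect sizes are all assumptions, not values from the thesis's corpus.

```python
# Sketch: rank correlation between antipsychotic equivalent dose and
# per-recording acoustic features, as in a dose-effect correlation analysis.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n = 60  # hypothetical number of recordings

eq_dose = rng.uniform(0, 20, n)  # equivalent antipsychotic dose (arbitrary mg)
features = {
    # speech rate slows as dose increases (simulated dose effect)
    "speech_rate": 5.0 - 0.08 * eq_dose + rng.normal(0, 0.4, n),
    # F0 standard deviation is unrelated to dose by construction
    "f0_sd": rng.normal(25, 5, n),
}

corrs = {}
for name, values in features.items():
    rho, p = spearmanr(eq_dose, values)
    corrs[name] = (rho, p)
    print(f"{name}: rho={rho:+.2f}, p={p:.3g}")
```

Features whose rank correlation with equivalent dose is strong and significant are the candidates for "speech characteristics affected by antipsychotics"; uncorrelated features serve as controls.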

    Automated screening methods for mental and neuro-developmental disorders

    Get PDF
    Mental and neuro-developmental disorders such as depression, bipolar disorder, and autism spectrum disorder (ASD) are critical healthcare issues which affect a large number of people. Depression, according to the World Health Organisation, is the largest cause of disability worldwide and affects more than 300 million people. Bipolar disorder affects more than 60 million individuals worldwide. ASD, meanwhile, affects more than 1 in 100 people in the UK. Not only do these disorders adversely affect the quality of life of affected individuals, they also have a significant economic impact. While brute-force approaches are potentially useful for learning new features which could be representative of these disorders, such approaches may not be best suited for developing robust screening methods. This is due to a myriad of confounding factors, such as age, gender, cultural background, and socio-economic status, which can affect the social signals of individuals in much the same way as the symptoms of these disorders. Brute-force approaches may learn to exploit the effects of these confounding factors on social signals in place of effects due to mental and neuro-developmental disorders. The main objective of this thesis is to develop, investigate, and propose computational methods to screen for mental and neuro-developmental disorders in accordance with the descriptions given in the Diagnostic and Statistical Manual (DSM), a guidebook published by the American Psychiatric Association which offers a common language on mental disorders. Our motivation is to alleviate, to an extent, the possibility of machine learning algorithms picking up one of the confounding factors to optimise performance for the dataset, something which we do not find uncommon in the research literature.
To this end, we introduce three new methods for automated screening for depression from audio/visual recordings, namely: turbulence features, craniofacial movement features, and Fisher Vector based representation of speech spectra. We surmise that psychomotor changes due to depression lead to uniqueness in an individual's speech pattern which manifest as sudden and erratic changes in speech feature contours. The efficacy of these features is demonstrated as part of our solution to Audio/Visual Emotion Challenge 2017 (AVEC 2017) on Depression severity prediction. We also detail a methodology to quantify specific craniofacial movements, which we hypothesised could be indicative of psychomotor retardation, and hence depression. The efficacy of craniofacial movement features is demonstrated using datasets from the 2014 and 2017 editions of AVEC Depression severity prediction challenges. Finally, using the dataset provided as part of AVEC 2016 Depression classification challenge, we demonstrate that differences between speech of individuals with and without depression can be quantified effectively using the Fisher Vector representation of speech spectra. For our work on automated screening of bipolar disorder, we propose methods to classify individuals with bipolar disorder into states of remission, hypo-mania, and mania. Here, we surmise that like depression, individuals with different levels of mania have certain uniqueness to their social signals. Based on this understanding, we propose the use of turbulence features for audio/visual social signals (i.e. speech and facial expressions). We also propose the use of Fisher Vectors to create a unified representation of speech in terms of prosody, voice quality, and speech spectra. These methods have been proposed as part of our solution to the AVEC 2018 Bipolar disorder challenge. In addition, we find that the task of automated screening for ASD is much more complicated. 
    Here, confounding factors can easily overwhelm the social signals which are affected by ASD. We discuss, in the light of the research literature and our experimental analysis, that significant collaborative work is required between computer scientists and clinicians to discern social signals which are robust to common confounding factors.
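The "turbulence features" above are described as capturing sudden, erratic changes in speech feature contours. One plausible way to quantify that (an illustrative assumption, not the thesis's exact definition) is to summarise the frame-to-frame deltas of a contour:

```python
# Sketch: turbulence-style summary statistics of a feature contour,
# contrasting a steady contour with an erratic one.
import numpy as np

def turbulence_stats(contour):
    """Summary statistics of frame-to-frame change in a feature contour."""
    deltas = np.diff(contour)
    return {
        "delta_std": float(np.std(deltas)),        # overall volatility
        "sign_flips": int(np.sum(np.diff(np.sign(deltas)) != 0)),  # erraticness
        "max_jump": float(np.max(np.abs(deltas))),  # largest sudden change
    }

t = np.linspace(0, 2 * np.pi, 200)
smooth = np.sin(t)                                   # steady contour
erratic = np.sin(t) + np.random.default_rng(3).normal(0, 0.3, t.size)

print(turbulence_stats(smooth))
print(turbulence_stats(erratic))
```

Applied to real F0, energy, or facial-landmark contours, such statistics give a compact per-recording descriptor that a classifier or severity regressor can consume.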