774 research outputs found

    Analysis of atypical prosodic patterns in the speech of people with Down syndrome

    Get PDF
    Producción Científica

    The speech of people with Down syndrome (DS) shows prosodic features which are distinct from those observed in the oral productions of typically developing (TD) speakers. Although a different prosodic realization does not necessarily imply incorrect expression of prosodic functions, atypical expression may hinder communication skills. The focus of this work is to ascertain whether this can be the case in individuals with DS. To do so, we analyze the acoustic features that best characterize the utterances of speakers with DS when expressing prosodic functions related to emotion, turn-end, and phrasal chunking, comparing them with those used by TD speakers. An oral corpus of speech utterances was recorded using the PEPS-C prosodic competence evaluation tool. We use automatic classifiers to show that the prosodic features that best predict prosodic functions in TD speakers are less informative in speakers with DS. Although atypical features are observed in speakers with DS when producing prosodic functions, the intended prosodic function can be identified by listeners and, in most cases, the features correctly discriminate the function with analytical methods. However, a greater difference between the minimal pairs presented in the PEPS-C test is found for TD speakers in comparison with DS speakers. The proposed methodological approach provides, on the one hand, an identification of the set of features that distinguish the prosodic productions of DS and TD speakers and, on the other, a set of target features for therapy with speakers with DS. Funded by Ministerio de Economía, Industria y Competitividad - Fondo Europeo de Desarrollo Regional (grant TIN2017-88858-C2-1-R) and Junta de Castilla y León (grant VA050G18).
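The comparison the abstract describes, training classifiers per speaker group and checking how informative each acoustic feature is, can be sketched as follows. This is a hypothetical illustration on synthetic data, not the authors' code; the feature names and the separation values are assumptions.

```python
# Sketch: predict a prosodic function (turn-end vs. non-final) from acoustic
# features, then compare accuracy and feature importances across two groups.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def make_group(n, f0_separation):
    """Synthetic per-utterance features: [final F0 slope, final-syllable
    duration, intensity drop]; f0_separation controls how well F0 slope
    separates the two prosodic functions."""
    y = rng.integers(0, 2, n)                 # 0 = non-final, 1 = turn-end
    f0_slope = y * f0_separation + rng.normal(0, 1, n)
    duration = y * 0.5 + rng.normal(0, 1, n)
    intensity = rng.normal(0, 1, n)           # uninformative by construction
    return np.column_stack([f0_slope, duration, intensity]), y

results = {}
# TD speakers: F0 slope strongly separates the classes; DS speakers: weakly.
for name, sep in [("TD", 2.0), ("DS", 0.5)]:
    X, y = make_group(400, sep)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    results[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(name, "accuracy %.2f" % results[name],
          "importances", clf.feature_importances_.round(2))
```

The same feature set yields lower accuracy for the group in which the feature is less discriminative, which mirrors the paper's finding that TD-informative features are less informative for DS speech.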

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The proceedings of the MAVEBA Workshop, held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are: the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images as a support to the clinical diagnosis and classification of vocal pathologies.

    Genomics Approaches for Studying Musical Aptitude and Related Traits

    Get PDF
    Print Publication Date: Jul 2019. Peer reviewed.

    Big Data analytics to assess personality based on voice analysis

    Full text link
    Bachelor's thesis (Trabajo Fin de Grado) in Ingeniería de Tecnologías y Servicios de Telecomunicación.

    When humans speak, the series of acoustic signs they produce encodes not only the linguistic message they wish to communicate but also several other kinds of information about themselves and their states, which offer glimpses of their personalities and can be picked up by judges. Since filming job candidates' interviews is now a common practice, the aim of this thesis is to explore possible correlations between speech features extracted from interviews and personality characteristics established by experts, and to try to predict a candidate's Big Five personality traits: Conscientiousness, Agreeableness, Neuroticism, Openness to Experience and Extraversion. The features were extracted from an original database of 44 video recordings of women acquired in 2020, plus 78 acquired in 2019 and earlier for a previous study. Although many significant correlations were found in each year's dataset, many of them proved inconsistent across the two studies. Only Extraversion, and to a more limited extent Openness, showed a substantial number of clear correlations. In essence, Extraversion was found to be related to variation in the slope of the pitch (usually at the end of sentences), suggesting that a more "singing" voice could be associated with a higher score. Spectral entropy and roll-off measurements likewise indicated that larger changes in the spectrum (which may also correspond to more "singing" voices) could be associated with greater Extraversion. As for the predictive modelling algorithms intended to estimate personality traits from the speech features, results were very limited in terms of accuracy and RMSE, as confirmed by scatter plots for the regression models and confusion matrices for the classification evaluation.
Nevertheless, several results suggest some predictive capability, and Extraversion and Openness again emerged as the most predictable traits. Better outcomes were achieved when predictions were based on one specific feature rather than on all of them or on a reduced group: this was the case for Openness, when estimated through linear and logistic regression on the time spent above 90% of the variation range of the deltas of the spectral-entropy modulus, and for Extraversion, which correlates well with features capturing variation in the decreasing F0 slope and variations in the spectrum. Several machine learning algorithms were used for the predictions, including linear regression, logistic regression and random forests.
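The single-feature-versus-all-features comparison described above can be illustrated with a minimal regression sketch. All data here is synthetic and the feature names are invented stand-ins, not the thesis's actual feature set.

```python
# Sketch: predict an extraversion score from one informative prosodic feature
# vs. that feature plus several noise features, comparing test RMSE.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 122  # roughly the 44 + 78 recordings described above

# Synthetic data: F0 end-slope variation correlates with extraversion;
# the remaining columns are noise (standing in for inconsistent features).
f0_slope_var = rng.normal(0, 1, n)
noise = rng.normal(0, 1, (n, 8))
extraversion = 3 + 0.8 * f0_slope_var + rng.normal(0, 0.5, n)

X_one = f0_slope_var.reshape(-1, 1)
X_all = np.column_stack([f0_slope_var, noise])

rmses = {}
for label, X in [("single feature", X_one), ("all features", X_all)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, extraversion, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    rmses[label] = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(label, "RMSE %.2f" % rmses[label])
```

With so few recordings, extra uninformative features mostly add variance, which is one plausible reason single-feature models fared better in the thesis; random forests and logistic regression would slot into the same loop.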

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a strongly felt need to share know-how, objectives and results between areas that had until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread into other fields of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.

    The Role of Compositional Music Therapy in the Treatment of Adults with Bipolar Disorder

    Get PDF
    The purpose of this thesis is to expand our knowledge and understanding of Bipolar Disorder and its relationship with compositional music therapy as a possible beneficial treatment for the illness. Compositional music therapy is both a music therapy intervention and a process in which the client and therapist work together to generate an original, permanent musical product. The music may be instrumental or vocal, of any genre, and may be notated as a score, handwritten or typed, or recorded (CD, tape, MP3, etc.). It may incorporate the client's original song/rap, lyrics, or poetry (set to music), or be instrumental only. This leads to the research question: what are the benefits of compositional music therapy for clients diagnosed with Bipolar Disorder? The question is examined through three subordinate questions: 1. What are the bases and/or rationales for selecting music composition as a method to address client goals for adult patients diagnosed with Bipolar Disorder? 2. How is the compositional music therapy process useful and/or helpful in addressing the clinical goals of clients diagnosed with Bipolar Disorder? 3. How is the compositional music therapy product useful and/or helpful in addressing those clinical goals? The study seeks to answer both the research question and the subordinate questions by consulting music therapists who have worked directly with adult patients diagnosed with Bipolar Disorder. This survey research examines the benefits of compositional music therapy as distinct from other forms of music therapy, as well as what is important about the compositional product and process for adult clients diagnosed with Bipolar Disorder.

    Voice-based analysis of major depressive disorder: focusing on acoustic changes in continuous utterances

    Get PDF
    Ph.D. dissertation, Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Digital Information Convergence), February 2023. Advisor: Kyogu Lee.

    Major depressive disorder (commonly referred to as depression) is a common disorder that affects 3.8% of the world's population. Depression stems from various causes, such as genetics, aging, social factors, and abnormalities in the neurotransmitter system; thus, early detection and monitoring are essential. The human voice is considered a representative biomarker for observing depression, and accordingly several studies have developed automatic depression diagnosis systems based on speech. However, constructing a speech corpus is challenging, most studies focus on adults under 60 years of age, and medical hypotheses grounded in the clinical findings of psychiatrists remain insufficient, which limits the evolution of such systems into medical diagnostic tools. Moreover, the effect of taking antipsychotic drugs on speech characteristics during the treatment phase is overlooked. This thesis therefore studies a speech-based automatic depression diagnosis system at the semantic level (sentence). First, to analyze depression among the elderly, whose emotional changes are not adequately reflected in speech characteristics, it develops mood-inducing sentences to build an elderly depression speech corpus and designs an automatic depression diagnosis system for the elderly. Second, it constructs an extrapyramidal symptom speech corpus to investigate extrapyramidal symptoms, a typical side effect of antipsychotic drug overdose, and finds a strong correlation between antipsychotic dose and speech characteristics. The study paves the way for a comprehensive examination of automatic diagnosis systems for depression.

    Contents: Chapter 1, Introduction; Chapter 2, Theoretical Background; Chapter 3, Developing Sentences for a New Depressed Speech Corpus; Chapter 4, Screening Depression in the Elderly; Chapter 5, Correlation Analysis of Antipsychotic Dose and Speech Characteristics; Chapter 6, Conclusions and Future Work.
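The dose-versus-speech-characteristics analysis in Chapter 5 is, in outline, a correlation study. A minimal sketch on synthetic data follows; the feature names, dose range, and effect sizes are all assumptions, not values from the thesis's corpus.

```python
# Sketch: rank correlation between antipsychotic equivalent dose and
# per-recording acoustic features, as in a dose-effect correlation analysis.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n = 60  # hypothetical number of recordings

eq_dose = rng.uniform(0, 20, n)  # equivalent antipsychotic dose (arbitrary mg)
features = {
    # speech rate slows as dose increases (simulated dose effect)
    "speech_rate": 5.0 - 0.08 * eq_dose + rng.normal(0, 0.4, n),
    # F0 standard deviation is unrelated to dose by construction
    "f0_sd": rng.normal(25, 5, n),
}

corrs = {}
for name, values in features.items():
    rho, p = spearmanr(eq_dose, values)
    corrs[name] = (rho, p)
    print(f"{name}: rho={rho:+.2f}, p={p:.3g}")
```

Features whose rank correlation with equivalent dose is strong and significant are the candidates for "speech characteristics affected by antipsychotics"; uncorrelated features serve as controls.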

    Automated screening methods for mental and neuro-developmental disorders

    Get PDF
    Mental and neuro-developmental disorders such as depression, bipolar disorder, and autism spectrum disorder (ASD) are critical healthcare issues which affect a large number of people. Depression, according to the World Health Organisation, is the largest cause of disability worldwide and affects more than 300 million people. Bipolar disorder affects more than 60 million individuals worldwide. ASD, meanwhile, affects more than 1 in 100 people in the UK. Not only do these disorders adversely affect the quality of life of affected individuals, they also have a significant economic impact. While brute-force approaches are potentially useful for learning new features which could be representative of these disorders, such approaches may not be best suited for developing robust screening methods. This is due to a myriad of confounding factors, such as age, gender, cultural background, and socio-economic status, which can affect the social signals of individuals in much the same way as the symptoms of these disorders. Brute-force approaches may learn to exploit the effects of these confounding factors on social signals in place of effects due to mental and neuro-developmental disorders. The main objective of this thesis is to develop, investigate, and propose computational methods to screen for mental and neuro-developmental disorders in accordance with the descriptions given in the Diagnostic and Statistical Manual (DSM), a guidebook published by the American Psychiatric Association which offers a common language on mental disorders. Our motivation is to alleviate, to an extent, the possibility of machine learning algorithms picking up one of the confounding factors to optimise performance for the dataset, something which we do not find uncommon in the research literature.
To this end, we introduce three new methods for automated screening for depression from audio/visual recordings, namely: turbulence features, craniofacial movement features, and Fisher Vector based representation of speech spectra. We surmise that psychomotor changes due to depression lead to uniqueness in an individual's speech pattern which manifest as sudden and erratic changes in speech feature contours. The efficacy of these features is demonstrated as part of our solution to Audio/Visual Emotion Challenge 2017 (AVEC 2017) on Depression severity prediction. We also detail a methodology to quantify specific craniofacial movements, which we hypothesised could be indicative of psychomotor retardation, and hence depression. The efficacy of craniofacial movement features is demonstrated using datasets from the 2014 and 2017 editions of AVEC Depression severity prediction challenges. Finally, using the dataset provided as part of AVEC 2016 Depression classification challenge, we demonstrate that differences between speech of individuals with and without depression can be quantified effectively using the Fisher Vector representation of speech spectra. For our work on automated screening of bipolar disorder, we propose methods to classify individuals with bipolar disorder into states of remission, hypo-mania, and mania. Here, we surmise that like depression, individuals with different levels of mania have certain uniqueness to their social signals. Based on this understanding, we propose the use of turbulence features for audio/visual social signals (i.e. speech and facial expressions). We also propose the use of Fisher Vectors to create a unified representation of speech in terms of prosody, voice quality, and speech spectra. These methods have been proposed as part of our solution to the AVEC 2018 Bipolar disorder challenge. In addition, we find that the task of automated screening for ASD is much more complicated. 
    Here, confounding factors can easily overwhelm the social signals which are affected by ASD. We discuss, in the light of the research literature and our experimental analysis, that significant collaborative work is required between computer scientists and clinicians to discern social signals which are robust to common confounding factors.
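The "turbulence features" above are described as capturing sudden, erratic changes in speech feature contours. One plausible way to quantify that (an illustrative assumption, not the thesis's exact definition) is to summarise the frame-to-frame deltas of a contour:

```python
# Sketch: turbulence-style summary statistics of a feature contour,
# contrasting a steady contour with an erratic one.
import numpy as np

def turbulence_stats(contour):
    """Summary statistics of frame-to-frame change in a feature contour."""
    deltas = np.diff(contour)
    return {
        "delta_std": float(np.std(deltas)),        # overall volatility
        "sign_flips": int(np.sum(np.diff(np.sign(deltas)) != 0)),  # erraticness
        "max_jump": float(np.max(np.abs(deltas))),  # largest sudden change
    }

t = np.linspace(0, 2 * np.pi, 200)
smooth = np.sin(t)                                   # steady contour
erratic = np.sin(t) + np.random.default_rng(3).normal(0, 0.3, t.size)

print(turbulence_stats(smooth))
print(turbulence_stats(erratic))
```

Applied to real F0, energy, or facial-landmark contours, such statistics give a compact per-recording descriptor that a classifier or severity regressor can consume.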