    Pan European Voice Conference - PEVOC 11

    The Pan European VOice Conference (PEVOC) was born in 1995 and therefore in 2015 it celebrates the 20th anniversary of its establishment: an important milestone that clearly expresses the strength and interest of the scientific community for the topics of this conference. The most significant themes of PEVOC are singing pedagogy and art, but also occupational voice disorders, neurology, rehabilitation, image and video analysis. PEVOC takes place in different European cities every two years (www.pevoc.org). The PEVOC 11 conference includes a symposium of the Collegium Medicorum Theatri (www.comet collegium.com

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy

    An investigation into glottal waveform based speech coding

    Coding of voiced speech by extraction of the glottal waveform has shown promise in improving the efficiency of speech coding systems. This thesis describes an investigation into the performance of such a system. The effect of reverberation on the radiation impedance at the lips is shown to be negligible under normal conditions. Also, the accuracy of the Image Method for adding artificial reverberation to anechoic speech recordings is established. A new algorithm, Pre-emphasised Maximum Likelihood Epoch Detection (PMLED), for Glottal Closure Instant detection is proposed. The algorithm is tested on natural speech and is shown to be both accurate and robust. Two techniques for glottal waveform estimation, Closed Phase Inverse Filtering (CPIF) and Iterative Adaptive Inverse Filtering (IAIF), are compared. In tandem with an LF model fitting procedure, both techniques display a high degree of accuracy. However, IAIF is found to be slightly more robust. Based on these results, a Glottal Excited Linear Predictive (GELP) coding system for voiced speech is proposed and tested. Using a differential LF parameter quantisation scheme, the system achieves speech quality similar to that of U.S. Federal Standard 1016 CELP at a lower mean bit rate while incurring no extra delay

    Numerical Study of Laryngeal Control of Phonation using Realistic Finite Element Models of a Canine Larynx

    While many may take it for granted, the human voice is an incredible feat. An average person can produce a great variety of voices and change voice characteristics agilely even without formal training. The last several decades of research have established that the production of voice is largely a mechanical process, i.e., the sustained vibration of the vocal folds driven by the glottal air flow. Since one only has a single pair of vocal folds, the versatility comes from the ability to change the mechanical status of the vocal folds, including vocal fold length and thickness, tension, and level of adduction, through activation of the laryngeal muscles. However, the relationship between laryngeal muscle activity and the characteristics of voice is not well understood due to limitations in experimental observation and simplifications in modelling and simulations. The science is still far behind the art. The current research aims to investigate first the relationship between laryngeal muscle activation and the posture of the vocal folds and second the relationship between voice source characteristics and vocal fold mechanical status using more comprehensive numerical models and simulations, thus improving the understanding of the roles of each laryngeal muscle in voice control. To do so, (1) the mechanics involved in vocal fold posturing and vibration, especially muscle contraction, and (2) the realistic anatomical structure of the larynx must be considered properly. To achieve this goal, a numerical model of the larynx that is as realistic as possible was built. The geometry of the laryngeal components was reconstructed from high resolution MRI (Magnetic Resonance Imaging) data of an excised canine larynx, which makes the representation of the muscles and their sub-compartments, cartilages, and other important anatomical features of the larynx more accurate.
A previously proposed muscle activation model was implemented in a 3D finite element package and applied to the larynx model to simulate the action of laryngeal muscles. After validation of the numerical model against experimental data, extensive parametric studies involving different combinations of muscle activations were conducted to investigate how the voice source is controlled with laryngeal muscles. In the course of this study, some work was done to couple the same finite element tool with a Genetic Algorithm program to inversely determine model parameters in biomechanical models. The method was applied in a collaborative study on shape changes of a fish fin during swimming. This study is presented as a separate chapter at the end of this thesis. The method has potential application in determining parameters in vocal fold models and optimizing clinical vocal fold procedures. This thesis is essentially an assembly of the papers published by the author during the doctoral study, with the addition of an introductory chapter. Chapter 1 reviews the overall principles of voice production, the biomechanical basis of voice control, and past studies on voice control with a focus on the fundamental frequency. Chapter 2 describes the major numerical methods employed in this research with an emphasis on the finite element method. The muscle activation model is also described in this chapter. Chapter 3 describes the building of the larynx model from MRI data and its partial validation. Chapter 4 presents the application of the larynx model to posturing studies, including parametric activation of muscle groups and specific topics related to vocal fold posturing. Chapter 5 describes the change of vocal fold vibration dynamics under the influence of the interaction of the cricothyroid muscle and the thyroarytenoid muscle. The flow-structure interaction simulations were realized by coupling the larynx model to a simple Bernoulli flow model and a two-stage simulation technique.
Chapter 6 concludes the current thesis study. Suggestions for future studies are proposed. Chapter 7 is an independent study that is not related to voice control. It describes a numerical framework that inversely determines and validates model parameters of biomechanical models. The application of the proposed framework to a finite element model of a fish fin is presented

    A feasibility study of visual feedback speech therapy for nasal speech associated with velopharyngeal dysfunction

    Nasal speech associated with velopharyngeal dysfunction (VPD) is seen in children and adults with cleft palate and other conditions that affect soft palate function, with negative effects on quality of life. Treatment options include surgery and prosthetics depending on the nature of the problem. Speech therapy is rarely offered as an alternative treatment as evidence from previous studies is weak. However, there is evidence that visual biofeedback approaches are beneficial in other speech disorders and that this approach could benefit individuals with nasal speech who demonstrate potential for improved speech. Theories of learning and feedback also lend support to the view that a combined feedback approach would be most suitable. This feasibility study therefore aimed to develop and evaluate Visual Feedback Therapy (VFTh), a new behavioural speech therapy intervention, incorporating speech activities supported by visual biofeedback and performance feedback, for individuals with mild to moderate nasal speech. Evaluation included perceptual, instrumental and quality of life measures. Eighteen individuals with nasal speech were recruited from a regional cleft palate centre and twelve completed the study: six female and six male, eleven children (7 to 13 years) and one adult (43 years). Six participants had repaired cleft palate and six had VPD but no cleft. Participants received 8 sessions of VFTh from one therapist. The findings suggest that the intervention is feasible but some changes are required, including participant screening for adverse response and minimising disruptions to intervention scheduling. In blinded evaluation there was considerable variation in individual results but positive changes occurred in at least one speech symptom between pre- and post-intervention assessment for eight participants. Seven participants also showed improved nasalance scores and seven had improved quality of life scores.
This small study has provided important information about the feasibility of delivering and evaluating VFTh. It suggests that VFTh shows promise as an alternative treatment option for nasal speech but that further preliminary development and evaluation is required before larger scale research is indicated

    Registers in Singing. Empirical and Systematic Studies in the Theory of the Singing Voice

    A comparison of voice quality following radiotherapy or transoral laser microsurgery of T1a laryngeal carcinomas

    Introduction: Patients with laryngeal carcinoma often present early due to the change in their voice. The treatment for T1aN0M0 carcinoma varies throughout the world, but both radiotherapy (RT) and endolaryngeal laser excision result in excellent local control of the tumour and five-year survival rates. There are advantages and disadvantages of either treatment but there are no appropriately powered randomised controlled trials comparing them. Over recent decades external beam RT has become the more popular choice and this is partly due to a perception of poor voice outcomes from surgical excision. However, with the development of technology allowing surgical precision, transoral laser microsurgery (TLM) has resulted in low morbidity and good voice outcomes. Objective: This research has three main objectives: a. To describe acoustic parameters of ‘normal’ voice; b. To compare voice outcomes in patients treated with TLM with those treated with radiotherapy for T1a SCC of the glottis; c. To investigate longitudinal changes in voice quality in patients undergoing TLM for T1a SCC of the glottis. Methods: The research was divided into three main parts. The first part was to analyse the acoustic parameters of ‘normal’ voice. To describe the parameters of ‘normal’ voice, adults with no history of voice disorders who scored zero on the voice questionnaire (Voice Handicap Index - 10) were included. The second part comprised a comparative cohort study of 40 patients with T1aN0M0 laryngeal carcinoma, treated with either TLM (20 patients) or RT (20 patients) to compare voice outcomes at least one year following treatment. The third part involved a prospective cohort study of 30 patients with T1aN0M0 laryngeal carcinomas who were treated with TLM, comparing voice qualities before and after treatment. All patients were recruited from those attending the regional Head and Neck centre in Aintree University Hospital.
The same methodology was adopted for voice recordings for all three parts of the study. Participants were asked to read a phonetically balanced passage and produce a prolonged vowel sound. In a sound proof room the voice recording included simultaneous audio and electrolaryngograph readings. The voice recordings were scored according to the GRBAS voice scale by an experienced rater. Acoustic analysis was performed from the electrolaryngograph recording using the SpeechStudio™ software. Several objective acoustic parameters were calculated from both sustained vowels and connected speech. These include: fundamental frequency (Fx), jitter, shimmer, harmonics to noise ratio (HNR) and normalized noise energy (NNE). In the comparative study of TLM versus RT and the prospective TLM study, patients were asked to complete voice-specific and quality of life questionnaires. The voice-specific questionnaires were the Voice Symptom Scale (VoiSS) and the Voice Handicap Index-10 (VHI-10). The quality of life questionnaire adopted was the University of Washington Quality of Life (UWQoL) version 4. Results: In the acoustic analysis of sustained vowels in normal speech, females have a statistically significantly higher Fx than males (adjusted p<0.05). There is no other statistically significant difference across the domains for sustained vowels in normal speech. In the analysis of connected speech, Fx is again higher in females (p<0.001). There is no statistically significant difference in amplitude (Ax) or contact quotient (Qx). In the comparison of voice post TLM and RT, there is no statistical difference in voice-specific questionnaires between the groups. The UW-QoL4 found a statistically significantly higher QoL score in the TLM compared with the RT group for appearance (p=0.003), recreation (p=0.048), chewing (p=0.015) and saliva (p=0.016); however, these are not statistically significant when adjusted for age.
Overall for QoL, the RT group have a statistically significantly lower median score compared to TLM in physical function (p=0.004) and this remains statistically significant when adjusted for age (p=0.036). There is no statistically significant difference for social function (p=0.441). There is no statistically significant difference in perceptual rating (GRBAS score) between RT and TLM groups (total mean 5.49 vs. 5.12, p=0.254). Most domains as part of the acoustic analysis of sustained vowels show no statistically significant difference between RT and TLM. The mean Fx analysis on connected speech is statistically significantly higher in the TLM group (161.2Hz vs. 131.1Hz, adjusted p=0.001). Coherence of frequency is statistically significantly higher in the TLM group (48.6% vs. 36.0%, adjusted p=0.027) and pitch irregularity is statistically significantly higher in the RT group (26.7% vs. 14.9%, adjusted p=0.013). There is no statistically significant difference in mean amplitude between the two groups. Coherence of amplitude is statistically significantly higher in the TLM group (adjusted p=0.006) and amplitude irregularity is statistically significantly higher in the RT group (12.4% vs. 6.3%, adjusted p=0.005). There is no statistically significant difference in mean contact quotient (p=0.368), coherence (p=0.236) or irregularity (p=0.125) when comparing TLM and RT. In the comparison of voice pre and post TLM, there is no statistical difference in voice-specific questionnaires between the groups. There is no statistically significant difference in the UW-QOLv4 domain scores or composite scores in patients pre- and post-TLM. There was no statistically significant difference in mean score for ‘G’, ‘R’, ‘B’ and ‘S’ indicators as part of perceptual rating between pre and post TLM patients, although asthenia was statistically significantly lower post-TLM (0.97 vs. 0.94, adjusted p=0.015).
There is no statistically significant difference in any of the domains in the acoustic analysis of sustained vowels pre and post TLM. In the acoustic analysis of connected speech, the mean DFx is statistically significantly higher in the post TLM group (adjusted p=0.001). There is no statistically significant difference in the coherence of frequency or pitch irregularity when comparing pre and post TLM. There is no statistically significant difference in the mean DAx (p=0.121), coherence (p=0.472) or irregularity of amplitude (p=0.184) when comparing pre and post TLM. There is no statistically significant difference in the mean DQx (adjusted p=0.904), coherence (adjusted p=0.293) or irregularity of the contact quotient (adjusted p=0.400) when comparing pre and post TLM. Conclusion: The treatment of T1a laryngeal carcinoma with either TLM or RT has been shown to have comparably good local control. There are advantages and disadvantages of both treatments; however, TLM is often preferred by patient and clinician as it is a day case procedure, can provide histological clearance and leaves the option to use RT in the future. However, voice outcomes of the procedures have been debated with various reports in the literature. There are challenges when comparing the two treatment modalities due to a number of tumour, patient and surgical factors. It is not surprising that the voice is affected by whatever treatment is performed to treat the glottic carcinoma. This study shows that voice quality is good, however it is measured, after both TLM and RT

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy