22 research outputs found

    Cepstral Peak Prominence Smoothed distribution as discriminator of vocal health in sustained vowel

    This paper focuses on Cepstral Peak Prominence Smoothed (CPPS) as a possible indicator of vocal health status, considering individual CPPS distribution and its descriptive statistics. A total of 31 voluntary patients and 22 control subjects performed the same protocol, which includes the simultaneous acquisition of three repetitions of the sustained vowel /a/ with a microphone in air and a contact sensor, the perceptual assessment of voice and the videolaryngoscopy examination. The best logistic regression models have been applied and preliminary results showed that the fifth percentile and the standard deviation of CPPS distributions are the best parameters that discriminate healthy and unhealthy voice for the microphone in air and the contact sensor, respectively. The Area Under Curve (AUC) revealed the diagnostic precision of the selected CPPS parameters: AUCs of 0.96 and 0.83 have been found for the microphone in air and the contact sensor, showing strong and moderate discrimination power, respectively. The repeatability of the selected CPPS parameters has also been estimated. For each selected CPPS parameter, the Monte Carlo method has been implemented in order to evaluate the uncertainty of the threshold, which was identified by means of Receiver Operating Characteristic (ROC) curve analysis
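
As a rough illustration of the quantities named in the abstract above, the following Python sketch summarises one speaker's CPPS distribution by its fifth percentile and standard deviation, and computes an AUC from per-speaker scores. It is not the authors' implementation: the function names, the linear-interpolation percentile, and the pairwise (Mann-Whitney) form of the AUC are assumptions made for the example, and lower scores are treated as pointing to an unhealthy voice.

```python
import statistics


def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def cpps_summary(frame_cpps_db):
    """Descriptive statistics of one speaker's frame-wise CPPS values (dB).

    Hypothetical helper: the input is assumed to be a list of CPPS values,
    one per analysis frame, produced by some external CPPS estimator.
    """
    return {
        "p5": percentile(frame_cpps_db, 5),
        "sd": statistics.stdev(frame_cpps_db),
    }


def auc_lower_is_unhealthy(patient_scores, control_scores):
    """AUC as the fraction of (patient, control) pairs in which the patient
    has the lower score; ties count as half a pair."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p < c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))
```

An AUC of 1.0 from this function means every patient scored below every control, while 0.5 means the score does not separate the two groups at all.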

    Towards vocal-behaviour and vocal-health assessment using distributions of acoustic parameters

    Voice disorders at different levels affect those professional categories that make use of voice in a sustained way and for prolonged periods of time, the so-called occupational voice users. In-field voice monitoring is needed to investigate voice behaviour and vocal health status during everyday activities and to highlight work-related risk factors. The overall aim of this thesis is to contribute to the identification of tools, procedures and requirements related to the voice acoustic analysis as an objective measure to prevent voice disorders, but also to assess them and furnish proof of outcomes during voice therapy. The first part of this thesis includes studies on vocal-load related parameters. Experiments were performed both in-field and in the laboratory. A one-school-year longitudinal study of teachers’ voice use during working hours was performed in high school classrooms using a voice analyzer equipped with a contact sensor; further measurements took place in the semi-anechoic and reverberant rooms of the National Institute of Metrological Research (I.N.Ri.M.) in Torino (Italy) to investigate the effects of very low and excessive reverberation on speech intensity, using both microphones in air and contact sensors. Within this framework, the contributions of the sound pressure level (SPL) uncertainty estimation using different devices were also assessed with proper experiments. Teachers adjusted their voice significantly with noise and reverberation, both at the beginning and at the end of the school year. Moreover, teachers who worked in the worst acoustic conditions showed higher SPLs and a worse vocal health status at the end of the school year. The minimum value of speech SPL was found for teachers in classrooms with a reverberation time of about 0.8 s. 
Participants involved in the in-laboratory experiments significantly increased their speech intensity by about 2.0 dB in the semi-anechoic room compared with the reverberant room, when describing a map. Such results are related to the speech monitoring performed with the vocal analyzer, whose uncertainty estimation for SPL differences was about 1 dB. The second part of this thesis addressed vocal health and voice quality assessment using different speech materials and devices. Experiments were performed in clinics, in collaboration with the Department of Surgical Sciences of Università di Torino (Italy) and the Department of Clinical Science, Intervention and Technology of Karolinska Institutet in Stockholm (Sweden). Individual distributions of Cepstral Peak Prominence Smoothed (CPPS) from voluntary patients and control subjects were investigated in sustained vowels, reading, free speech and excerpted vowels from continuous speech, which were acquired with microphones in air and contact sensors. The main influence quantities of the estimated cepstral parameters were also identified, which are the fundamental frequency of the vocalization and the broadband noise superimposed on the signal. In addition, the reliability of CPPS estimation with respect to the frequency content of the vocal spectrum was evaluated, which is mainly dependent on the bandwidth of the measuring chain used to acquire the vocal signal. Regarding the speech materials acquired with the microphone in air, the 5th percentile was the best statistic for CPPS distributions that can discriminate healthy and unhealthy voices in sustained vowels, while the 95th percentile was the best in both reading and free speech tasks. The discrimination thresholds were 15 dB (95% Confidence Interval, CI, of 0.7 dB) and 18 dB (95% CI of 0.6 dB), respectively, where lower values indicate a high probability of an unhealthy voice. 
Preliminary outcomes on excerpted vowels from continuous speech indicated that a CPPS mean value lower than 14 dB designates pathological voices. CPPS distributions were also effective as proof of outcomes after interventions, e.g. voice therapy and phonosurgery. Concerning the speech materials acquired with the electret contact sensor, a reasonable discrimination power was only obtained in the case of sustained vowel, where a standard deviation of the CPPS distribution higher than 1.1 dB (95% CI of 0.2 dB) indicates a high probability of an unhealthy voice. Further results indicated that a reliable estimation of CPPS parameters is obtained provided that the frequency content of the spectrum is not lower than 5 kHz: such outcome provides a guideline on the bandwidth of the measuring chain used to acquire the vocal signal
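
The decision rules reported in the thesis abstract above can be written down as a small lookup, sketched here in Python. The thresholds and statistics are the ones quoted in the abstract; the function name, the key names and the table layout are assumptions made for illustration, the confidence intervals are ignored, and nothing here should be read as a clinical tool.

```python
# Rules quoted in the abstract: (sensor, task) -> (statistic, threshold in dB, direction).
# The key strings are hypothetical labels chosen for this sketch.
CPPS_RULES = {
    ("microphone", "sustained_vowel"): ("p5", 15.0, "below"),
    ("microphone", "reading"): ("p95", 18.0, "below"),
    ("microphone", "free_speech"): ("p95", 18.0, "below"),
    ("microphone", "excerpted_vowels"): ("mean", 14.0, "below"),
    ("contact", "sustained_vowel"): ("sd", 1.1, "above"),
}


def flag_unhealthy(sensor, task, statistic_db):
    """Return (statistic_name, flagged) for a CPPS statistic given in dB.

    Raises KeyError for combinations the abstract reports no rule for,
    e.g. the contact sensor with reading or free speech.
    """
    name, threshold, direction = CPPS_RULES[(sensor, task)]
    if direction == "below":
        return name, statistic_db < threshold
    return name, statistic_db > threshold
```

For example, `flag_unhealthy("microphone", "sustained_vowel", 13.2)` returns `("p5", True)`, because a fifth percentile of 13.2 dB falls below the reported 15 dB threshold.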

    Relationship Between Laryngeal Signs and Symptoms, Acoustic Measures, and Quality of Life in Finnish Primary and Kindergarten School Teachers

    Objective This study investigated the relationship between the acoustic measure smoothed cepstral peak prominence (CPPS), teachers' quality of life as measured by the voice activity and participation profile (VAPP), laryngeal signs and symptoms, voice related health problems and laryngoscopic findings in Finnish teachers. The relationship between CPPS and sound pressure level (SPL) was also assessed. Methods Vowel and text samples from 183 healthy Finnish teachers (99 kindergarten teachers [KT] and 84 primary school teachers [PST]) were analyzed for CPPS. Text reading was recorded in conversational loudness by PST, and KT were recorded wearing headphones, while listening to a masking noise of children talking to simulate their classroom voice and environment. CPPS values were correlated with the VAPP, self-reported laryngeal signs and symptoms, voice related health variables, and laryngoscopic findings. Results There was a significant difference between the two groups for CPPS text: PST showed significantly lower CPPS values (10.44) than KT (11.52). There was no difference between the two groups for CPPS vowel phonation. There was a significant correlation between SPL text and CPPS text for KT (P < 0.001, r = 0.43) but not for PST (P < 0.10, r = 0.16). There was a significant correlation between SPL vowel and CPPS vowel for both PST (P < 0.001, r = 0.47) and KT (P < 0.001, r = 0.45). CPPS did not correlate with the VAPP, laryngeal signs and symptoms, health variables or laryngeal findings. Factorial analysis of variance resulted in a significant relationship between the VAPP, laryngeal signs and symptoms, and teacher type. Teacher type and symptoms had a significant effect on VAPP scores. Conclusions In the present work CPPS does not correlate with vocal health indicators of functionally healthy teachers. CPPS was significantly influenced by differences in speaking voice SPL, emphasizing the impact of recording conditions and technique. 
There was a significant relationship between laryngeal signs and symptoms, teacher type and the VAPP. Laryngeal signs and symptoms and teacher type are important variables and should be included in the clinical evaluation of occupational voice users, and voice problems

    Discriminating Pathological Voice From Healthy Voice Using Cepstral Peak Prominence Smoothed Distribution in Sustained Vowel

    This paper deals with cepstral peak prominence smoothed (CPPS) distribution and its descriptive statistics as possible indicators of vocal health status. A total of 41 voluntary patients and 35 control subjects participated in the experiment: all of them followed the same protocol, which includes three repetitions of the sustained vowel /a/ simultaneously acquired with a microphone in air and a contact sensor, the perceptual assessment of voice quality, and the videolaryngoscopy examination. The fifth percentile and the standard deviation of CPPS distribution were the parameters included in the best logistic regression models for the microphone in air and the contact sensor, respectively. The selected CPPS parameters had a strong to good discrimination power: areas under the curve of 0.95 and 0.87 were found for the microphone in air and for the contact sensor, respectively. For each CPPS parameter, the repeatability has also been estimated and the Monte Carlo method has been implemented for the uncertainty evaluation of the discrimination threshold. Furthermore, preliminary recommendations for better accuracy and repeatability of future studies are provided: analyses of the main CPPS influence quantities and of the effect of the frequency content of the signal spectrum on the CPPS parameters have been provided
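
The abstract above does not say how the Monte Carlo evaluation was set up, so the Python sketch below shows only one plausible reading. Each speaker's CPPS parameter is perturbed with Gaussian noise whose standard deviation stands in for the repeatability, a cut-off is re-derived on every trial, and the spread of the resulting cut-offs is reported. The Youden-index cut-off rule, the Gaussian noise model, and all names and default values are assumptions made for this example.

```python
import random
import statistics


def youden_threshold(patients, controls):
    """Cut-off t maximising sensitivity + specificity - 1, where a value
    below t is classified as unhealthy. Candidate cut-offs are midpoints
    between consecutive distinct observed values; returns None if there
    are fewer than two distinct values."""
    values = sorted(set(patients) | set(controls))
    best_cut, best_j = None, -1.0
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        sensitivity = sum(p < t for p in patients) / len(patients)
        specificity = sum(c >= t for c in controls) / len(controls)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cut, best_j = t, j
    return best_cut


def monte_carlo_threshold(patients, controls, repeat_sd, trials=2000, seed=0):
    """Mean and standard deviation of the cut-off over repeated trials in
    which every input value is perturbed by Gaussian noise of standard
    deviation repeat_sd (an assumed stand-in for the repeatability)."""
    rng = random.Random(seed)
    cuts = []
    for _ in range(trials):
        noisy_patients = [rng.gauss(x, repeat_sd) for x in patients]
        noisy_controls = [rng.gauss(x, repeat_sd) for x in controls]
        cuts.append(youden_threshold(noisy_patients, noisy_controls))
    return statistics.mean(cuts), statistics.stdev(cuts)
```

The standard deviation returned by `monte_carlo_threshold` can be read as an uncertainty estimate for the cut-off under these assumptions; a different noise model or cut-off criterion would give different figures.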

    Acoustic measurement of overall voice quality in sustained vowels and continuous speech

    Measurement of dysphonia severity involves auditory-perceptual evaluations and acoustic analyses of sound waves. Meta-analysis of proportional associations between these two methods showed that many popular perturbation metrics, noise-to-harmonics ratios and other ratios do not yield reasonable results. However, this meta-analysis demonstrated that the validity of specific autocorrelation- and cepstrum-based measures was much more convincing, and identified ‘smoothed cepstral peak prominence’ as the most promising metric of dysphonia severity. Original research confirmed this inferiority of perturbation measures and superiority of cepstral indices in dysphonia measurement of laryngeal-vocal and tracheoesophageal voice samples. However, to be truly representative of daily voice use patterns, measurement of overall voice quality is ideally founded on the analysis of sustained vowels and continuous speech. A customized method for including both sample types and calculating the multivariate Acoustic Voice Quality Index (i.e., AVQI) was constructed for this purpose. Original study of the AVQI revealed acceptable results in terms of initial concurrent validity, diagnostic precision, internal and external cross-validity and responsiveness to change. It thus was concluded that the AVQI can track changes in dysphonia severity across the voice therapy process. There are many freely and commercially available computer programs and systems for acoustic metrics of dysphonia severity. We investigated agreements and differences between two commonly available programs (i.e., Praat and Multi-Dimensional Voice Program) and systems. The results indicated that clinicians should not compare frequency perturbation data across systems and programs, or amplitude perturbation data across systems. Finally, acoustic information can also be utilized as a biofeedback modality during voice exercises. 
Based on a systematic literature review, it was cautiously concluded that acoustic biofeedback can be a valuable tool in the treatment of phonatory disorders. When applied with caution, acoustic algorithms (particularly cepstrum-based measures and AVQI) have merited a special role in assessment and/or treatment of dysphonia severity

    Acoustic and Perceptual Effects of Mask-Wearing on Voice and Communication in Healthcare Practitioners

    The present study aims to determine the perceptual and acoustic voice effects experienced by healthcare practitioners following prolonged use of face masks during the COVID-19 pandemic. A total of 19 participants were recruited and divided into control and experimental groups. Of these 19, 10 were assigned to the experimental group (E), which required participants to increase their fluid intake by 1 liter per day, while the remaining 9 were part of the control group (C) and had no additional instructions. To gather perceptual data, a survey was created and conducted via Qualtrics software, and addressed voice perception, mask use, and demographic information. To identify acoustic measure changes, pre and post work week recordings were collected, analyzed, and compared using PRAAT software. Additionally, comparisons between the acoustic measures of subjects in the control and experimental groups were completed to determine discrepancies that may provide insight on hydration and its role in vocal discomfort. Data are currently being analyzed, as data acquisition has only just been completed; however, it is hypothesized that the results of this study will indicate increased perceptions of vocal discomfort across all participants as per survey responses. Comparisons of each subject’s pre and post work week recordings are also hypothesized to show acoustic measure differences correlating with higher instances of vocal fatigue post work week due to prolonged voice use. Lastly, comparisons between the participants in the experimental and control groups are expected to indicate less vocal discomfort as a result of increased fluid intake, a common treatment method implemented for patients experiencing vocal discomfort. Findings of this study aim to expand on the current literature regarding perceptual and acoustic effects of masks on the voice of a population that (1) requires extensive use of the voice and (2) has experienced higher requirements of mask use to decrease transmission of infection. Both variables were noted to be exacerbated by the current pandemic, which has been shown to impact and increase the demand for quality care by healthcare practitioners

    VOCAL BIOMARKERS OF CLINICAL DEPRESSION: WORKING TOWARDS AN INTEGRATED MODEL OF DEPRESSION AND SPEECH

    Speech output has long been considered a sensitive marker of a person’s mental state. It has been previously examined as a possible biomarker for diagnosis and treatment response for certain mental health conditions, including clinical depression. To date, it has been difficult to draw robust conclusions from past results due to diversity in samples, speech material, investigated parameters, and analytical methods. Within this exploratory study of speech in clinically depressed individuals, articulatory and phonatory behaviours are examined in relation to psychomotor symptom profiles and overall symptom severity. A systematic review provided context from the existing body of knowledge on the effects of depression on speech, and informed the experimental setup within this body of work. Examinations of vowel space, monophthong, and diphthong productions as well as a multivariate acoustic analysis of other speech parameters (e.g., F0 range, perturbation measures, composite measures, etc.) are undertaken with the goal of creating a working model of the effects of depression on speech. Initial results demonstrate that overall vowel space area was not different between depressed and healthy speakers, but on closer inspection, this was due to more specific deficits seen in depressed patients along the first formant (F1) axis. Speakers with depression were more likely to produce centralised vowels along F1 than along F2, and this was more pronounced for low-front vowels, which are more complex given the degree of tongue-jaw coupling required for production. This pattern was seen in both monophthong and diphthong productions. 
Other articulatory and phonatory measures were inspected in a factor analysis as well, suggesting additional vocal biomarkers for consideration in diagnosis and treatment assessment of depression—including aperiodicity measures (e.g., higher shimmer and jitter), changes in spectral slope and tilt, and additive noise measures such as increased harmonics-to-noise ratio. Intonation was also affected by diagnostic status, but only for specific speech tasks. These results suggest that laryngeal and articulatory control is reduced by depression. Findings support the clinical utility of combining Ellgring and Scherer’s (1996) psychomotor retardation and social-emotional hypotheses to explain the effects of depression on speech, which suggest observed changes are due to a combination of cognitive, psycho-physiological and motoric mechanisms. Ultimately, depressive speech is able to be modelled along a continuum of hypo- to hyper-speech, where depressed individuals are able to assess communicative situations, assess speech requirements, and then engage in the minimum amount of motoric output necessary to convey their message. As speakers fluctuate with depressive symptoms throughout the course of their disorder, they move along the hypo-hyper-speech continuum and their speech is impacted accordingly. Recommendations for future clinical investigations of the effects of depression on speech are also presented, including suggestions for recording and reporting standards. Results contribute towards cross-disciplinary research into speech analysis between the fields of psychiatry, computer science, and speech science

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    Modelling loudness: Acoustic and perceptual correlates in the context of hypophonia in Parkinson’s disease

    Hypophonia (quiet speech) is a common speech symptom associated with Parkinson’s disease (PD), and is associated with reduced intelligibility, communicative effectiveness, and communicative participation. Studies of hypophonia commonly employ average speech intensity as the primary dependent measure, which may not entirely capture loudness deficits. Loudness may also be affected by the frequency components of speech (i.e. spectral balance) and speech level variability. The present investigation examined relationships between perceived loudness and intelligibility with acoustic measures of loudness, speech intensity, and spectral distribution in individuals with hypophonia secondary to Parkinson’s disease (IWPDs) and neurologically healthy older adults (HOAs). Samples of sentence reading and conversational speech from 56 IWPDs and 46 HOAs were presented to listeners for ratings of perceived loudness and intelligibility. Listeners provided ratings of loudness using visual analogue scales (VAS) and direct magnitude estimation (DME). Acoustic measures of speech level (e.g. mean intensity), spectral balance (e.g. spectral tilt), and speech level variability (e.g. standard deviation of intensity) were obtained for comparison with perceived characteristics. In a spectral manipulation experiment, a gain adjustment altered the spectral balance of sentence samples while maintaining equal mean intensity. Listeners provided VAS ratings of perceived loudness of these manipulated samples. IWPDs were quieter, less intelligible, and had a relatively greater concentration of low-frequency energy than HOAs. Speech samples with weaker contributions of mid- (2-5 kHz) and high-frequency (5-8 kHz) energy were perceived as quieter. Results of the spectral manipulation experiment indicated that increases in the relative contribution of 2-10 kHz energy were associated with increases in perceived loudness. 
The acoustic time-varying loudness model (TVL) demonstrated stronger associations with perceived loudness and larger differences between IWPDs and HOAs, and successfully identified differences in loudness in the spectral manipulation experiment. Loudness ratings provided with VAS and DME were consistent, both providing excellent reliability. Findings of this investigation indicate that perceived loudness, acoustic loudness, and spectral balance are important components of hypophonia evaluation. Incorporating spectral manipulation in amplification by increasing high-frequency energy may improve efficacy of amplification devices for hypophonia management