
    Acoustic measurement of overall voice quality in sustained vowels and continuous speech

    Measurement of dysphonia severity involves auditory-perceptual evaluations and acoustic analyses of sound waves. A meta-analysis of proportional associations between these two methods showed that many popular perturbation metrics, noise-to-harmonics ratios, and other ratios do not yield reasonable results. However, the same meta-analysis demonstrated that the validity of specific autocorrelation- and cepstrum-based measures was much more convincing, and identified ‘smoothed cepstral peak prominence’ as the most promising metric of dysphonia severity. Original research confirmed this inferiority of perturbation measures and superiority of cepstral indices in dysphonia measurement of laryngeal-vocal and tracheoesophageal voice samples. However, to be truly representative of daily voice use patterns, measurement of overall voice quality is ideally founded on the analysis of both sustained vowels and continuous speech. A customized method for including both sample types and calculating the multivariate Acoustic Voice Quality Index (AVQI) was constructed for this purpose. The original study of the AVQI revealed acceptable results in terms of initial concurrent validity, diagnostic precision, internal and external cross-validity, and responsiveness to change. It was thus concluded that the AVQI can track changes in dysphonia severity across the voice therapy process. There are many freely and commercially available computer programs and systems for acoustic metrics of dysphonia severity. We investigated agreements and differences between two commonly available programs (Praat and Multi-Dimensional Voice Program) and systems. The results indicated that clinicians should not compare frequency perturbation data across systems and programs, or amplitude perturbation data across systems. Finally, acoustic information can also be utilized as a biofeedback modality during voice exercises. Based on a systematic literature review, it was cautiously concluded that acoustic biofeedback can be a valuable tool in the treatment of phonatory disorders. When applied with caution, acoustic algorithms (particularly cepstrum-based measures and the AVQI) merit a special role in the assessment and/or treatment of dysphonia severity.

    Cepstral peak prominence: a comprehensive analysis

    An analytical study of cepstral peak prominence (CPP) is presented, intended to provide insight into its meaning and its relation to voice perturbation parameters. To carry out this analysis, a parametric approach is adopted in which voice production is modelled using the traditional source-filter model and the first cepstral peak is assumed to have a Gaussian shape. It is concluded that the meaning of CPP is very similar to that of the first rahmonic, and some insights are provided on its dependence on fundamental frequency and vocal tract resonances. It is further shown that CPP integrates measures of voice waveform and periodicity perturbations, whether these arise from amplitude, frequency, or noise.
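
    As a rough illustration of the quantity this abstract analyzes, the sketch below computes a CPP-like value for a single frame: the height of the cepstral peak above a regression line fitted through the cepstrum. This follows the commonly described definition, not the parametric model of the paper. The function names, the 60 to 330 Hz search range, the 1 ms lower bound for the regression, and the synthetic test signals are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(values):
    """Plain O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(values)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [sum(v * twiddle[(i * k) % n] for k, v in enumerate(values))
            for i in range(n)]


def cepstral_peak_prominence(frame, fs, f0_min=60.0, f0_max=330.0):
    """Return (CPP-like value in dB, F0 estimate in Hz) for one frame.

    Hypothetical helper: the search range and regression span are assumed.
    """
    n = len(frame)
    # Hann window to limit spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Log-magnitude spectrum in dB.
    log_spec = [20 * math.log10(abs(c) + 1e-12) for c in dft(windowed)]
    # Magnitude of the cepstrum, again expressed in dB.
    cep_db = [20 * math.log10(abs(c) / n + 1e-12) for c in dft(log_spec)]

    q_lo = int(fs / f0_max)  # shortest plausible pitch period, in samples
    q_hi = int(fs / f0_min)  # longest plausible pitch period, in samples
    peak_q = max(range(q_lo, q_hi + 1), key=lambda q: cep_db[q])

    # Least-squares line through the cepstrum from 1 ms up to q_hi.
    qs = list(range(int(0.001 * fs), q_hi + 1))
    mean_q = sum(qs) / len(qs)
    mean_c = sum(cep_db[q] for q in qs) / len(qs)
    slope = (sum((q - mean_q) * (cep_db[q] - mean_c) for q in qs)
             / sum((q - mean_q) ** 2 for q in qs))
    baseline = mean_c + slope * (peak_q - mean_q)
    return cep_db[peak_q] - baseline, fs / peak_q


# Synthetic comparison: a harmonic-rich 100 Hz tone versus white noise.
fs, n, f0 = 8000, 512, 100.0
random.seed(0)
voiced = [sum(math.sin(2 * math.pi * k * f0 * i / fs) / k for k in range(1, 40))
          for i in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
cpp_voiced, f0_est = cepstral_peak_prominence(voiced, fs)
cpp_noise, _ = cepstral_peak_prominence(noise, fs)
print(f"voiced: {cpp_voiced:.1f} dB at about {f0_est:.0f} Hz; noise: {cpp_noise:.1f} dB")
```

    The periodic signal should yield a clearly larger value than the noise, which matches the abstract's point that CPP reflects periodicity. Clinical tools additionally smooth the cepstrum across time and quefrency, which this sketch omits.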

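    Acoustic analysis of the voice: temporal, spectral, and cepstral measures in the normal voice with Praat in a sample of Spanish speakers (original title: Análisis acústico de la voz: medidas temporales, espectrales y cepstrales en la voz normal con el Praat en una muestra de hablantes de español)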

    Acoustic analysis is a tool that provides objective information about the voice. In recent years, long-term average spectrum (LTAS) measures and cepstral measures, such as smoothed cepstral peak prominence (CPPs), have complemented the traditionally used measures, showing in many studies a high correlation with the degree of dysphonia severity. The aim of this descriptive study was to compute, in Praat, normative values of temporal, spectral, and cepstral measures in a sample of 50 Spanish speakers (25 men and 25 women), taking into account the main factors that affect their reliability, such as microphone type, ambient noise level, analysis program, and the acoustic parameters used. Two voice samples were recorded for each subject: 1) a sustained /a/, from which CPPs and the parameters of fundamental frequency (F0), noise, and frequency and amplitude perturbation were computed, and 2) a connected speech sample, from which CPPs and the LTAS slopes were computed. The results for the sustained vowel show significant sex differences in F0, absolute jitter, and all amplitude perturbation and noise parameters. In connected speech, significant differences between men and women are observed in the spectral slope obtained from the trend line through the LTAS and in CPPs.

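    Classification of Parkinson's disease and other neurological disorders using voice feature extraction and reduction techniques (original title: KLASYFIKACJA CHOROBY PARKINSONA I INNYCH ZABURZEŃ NEUROLOGICZNYCH Z WYKORZYSTANIEM EKSTRAKCJI CECH GŁOSOWYCH I TECHNIK REDUKCJI)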

    This study aimed to differentiate individuals with Parkinson's disease (PD) from those with other neurological disorders (ND) by analyzing voice samples, considering the association between voice disorders and PD. Voice samples were collected from 76 participants using different recording devices and conditions, with participants instructed to sustain the vowel /a/ comfortably. PRAAT software was employed to extract features including autocorrelation (AC), cross-correlation (CC), and Mel frequency cepstral coefficients (MFCC) from the voice samples. Principal component analysis (PCA) was utilized to reduce the dimensionality of the features. Classification Tree (CT), Logistic Regression, Naive Bayes (NB), Support Vector Machines (SVM), and Ensemble methods were employed as supervised machine learning techniques for classification. Each method provided distinct strengths and characteristics, facilitating a comprehensive evaluation of their effectiveness in distinguishing PD patients from individuals with other neurological disorders. The Naive Bayes kernel, using seven PCA-derived components, achieved the highest accuracy rate of 86.84% among the tested classification methods. It is worth noting that classifier performance may vary based on the dataset and specific characteristics of the voice samples. In conclusion, this study demonstrated the potential of voice analysis as a diagnostic tool for distinguishing PD patients from individuals with other neurological disorders. By employing a variety of voice analysis techniques and utilizing different machine learning algorithms, including Classification Tree, Logistic Regression, Naive Bayes, Support Vector Machines, and Ensemble methods, a notable accuracy rate was attained. 
However, further research and validation using larger datasets are required to consolidate and generalize these findings for future clinical applications.

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), and IEEE Biomedical Engineering Soc. Special issues of international journals collecting selected papers from the conference have been, and will be, published.

    Phonation Types in Marathi: An Acoustic Investigation

    This dissertation presents a comprehensive instrumental acoustic analysis of phonation type distinctions in Marathi, an Indic language with numerous breathy voiced sonorants and obstruents. Important new facts about breathy voiced sonorants, which are crosslinguistically rare, are established: male and female speakers cue breathy phonation in sonorants differently, there is an abundance of trading relations, and, critically, phonation type distinctions are not cued as well by sonorants as by obstruents. Ten native speakers (five male, five female) were recorded producing Marathi words embedded in a carrier sentence. Tokens included plain and breathy voiced stops, affricates, nasals, laterals, rhotics, and approximants before the vowels [a] and [e]. Measures reported for consonants and subsequent vowels include duration, F0, Cepstral Peak Prominence (CPP), and corrected H1-H2*, H1-A1*, H1-A2*, and H1-A3* values. As expected, breathy voice is associated with decreased CPP and increased spectral values. A strong gender difference is revealed: low-frequency measures like H1-H2* cue breathy phonation more reliably in male speech, while CPP, which provides information about the aspiration noise included in the signal, is a more reliable cue in female speech. Trading relations are also reported: time and again, where one cue is weak or absent another cue is strong or present, underscoring the importance of including both genders and multiple vowel contexts when testing phonation type differences. Overall, the cues that are present for obstruents are not necessarily mirrored by sonorants. These findings are interpreted with reference to Dispersion Theory (Flemming 1995; Liljencrants & Lindblom 1972; Lindblom 1986, 1990). While various incarnations of Dispersion Theory focus on different aspects of perceptual and auditory distinctiveness, a basic claim is that phonological contrasts must be perceptually distinct: contrasts that are subject to great confusability are phonologically disfavored. The proposal, then, is that the typology of breathy voiced sonorants is due in part to the fact that they are not well differentiated acoustically. Breathy voiced sonorants are crosslinguistically rare because they do not make for strong phonemic contrasts.

    Acoustic analysis in mild cognitive diagnosis: systematic review of the literature in 2008–2020

    Introduction: Recent research has described changes in the production of vocal tone and timbre that occur in late adulthood. These changes indicate early cognitive disturbances, even in preclinical stages of cognitive decline. This study aims to identify relevant findings from the literature regarding acoustic analysis in elderly adults with cognitive impairment. Material and methods: A systematic review was conducted, in which the following databases were consulted: PlosOne, Science Direct, PubMed/pmc, and Google Scholar. Search terms such as acoustic analysis, Alzheimer’s disease, mild cognitive impairment, prosody, voice analysis, and voice production were used. Additionally, empirical articles describing acoustic analysis in elderly adults with cognitive risk were included. The evaluation was performed independently by two evaluators, who determined the risk of bias in the review. A total of 59 articles related to the topic were found, of which 25 met the inclusion criteria. Results: The reviewed articles identified changes in linguistic and paralinguistic prosody, timbre, and vocal tonality, which are associated with cognitive decline in the elderly. Conclusion: Study protocols in acoustic analysis could be a good tool to support the differential clinical diagnosis of cognitive deterioration in late adulthood and a good opportunity to identify risk in preclinical stages of dementia.

    Optimization and automation of relative fundamental frequency for objective assessment of vocal hyperfunction

    The project objective is to improve clinical assessment and diagnosis of the voice disorder vocal hyperfunction (VH). VH is a condition characterized by excessive laryngeal and paralaryngeal tension, and is assumed to be the underlying cause of the majority of voice disorders. Current clinical assessment of VH is subjective and demonstrates poor inter-rater reliability. Recent work indicates that a new acoustic measure, relative fundamental frequency (RFF), is sensitive to the maladaptive functional behaviors associated with VH and can potentially be used to objectively characterize VH. Here, we explored and enhanced the potential for RFF as a measure of VH in three ways. First, the current protocol for RFF estimation was optimized to simplify the recording procedure and reduce estimation time. Second, RFF was compared with the current state-of-the-art measures of VH: listener perception of vocal effort and the aerodynamic ratio of sound pressure level to subglottal pressure level. Third, an automated algorithm that utilized the optimized recording protocol was developed and validated against manual estimation methods and listener perception. This work enables large-scale studies on RFF to determine the specific physiological elements that contribute to the measure’s ability to capture VH and may potentially provide a non-invasive and readily implemented solution for this long-standing clinical issue.
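
    The abstract does not give a formula for RFF. As it is commonly described, RFF expresses the fundamental frequency of individual vocal cycles near a voiceless consonant in semitones relative to a steady-state reference cycle. The sketch below shows only that normalization step, under that assumption. The function name and the example frequencies are hypothetical, and cycle detection, which is the difficult part that the project automates, is not attempted.

```python
import math


def relative_f0_semitones(cycle_f0s, reference_f0):
    """Express each cycle's F0 in semitones relative to a reference F0.

    Hypothetical helper: assumes per-cycle F0 values (Hz) are already known.
    """
    return [12.0 * math.log2(f0 / reference_f0) for f0 in cycle_f0s]


# Made-up example values: F0 drifting away from a 200 Hz steady-state reference.
example_f0s = [200.0, 199.0, 197.5, 195.0]
rff = relative_f0_semitones(example_f0s, reference_f0=200.0)
print([round(value, 2) for value in rff])
```

    A value of 0 means the cycle matches the reference; negative values mean the cycle's F0 is lower than the reference.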

    Voicing quantification is more relevant than period perturbation in substitution voices: an advanced acoustical study

    Quality of substitution voicing—i.e., phonation with a voice that is not generated by the vibration of two vocal folds—cannot be adequately evaluated with routinely used software for acoustic voice analysis that is aimed at ‘common’ dysphonias and nearly periodic voice signals. The AMPEX analysis program (Van Immerseel and Martens) has been shown previously to be able to detect periodicity in irregular signals with background noise, and to be suited for running speech. The validity of this analysis program is first tested using realistic synthesized voice signals with known levels of cycle-to-cycle perturbations and additive noise. Second, exhaustive acoustic analysis is performed of the voices of 116 patients surgically treated for advanced laryngeal cancer and recorded in seven European academic centers. All of them read out a short phonetically balanced passage. Patients were divided into six groups according to the oscillating structures they used to phonate. Results show that features related to quantification of voicing enable a distinction between the different groups, while the features reporting F0-instability fail to do so. Acoustic evaluation of voice quality in substitution voices thus best relies upon voicing quantification
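
    For context on the "cycle-to-cycle perturbations" mentioned above, the sketch below computes a local perturbation quotient of the kind usually reported as local jitter (for period lengths) or local shimmer (for peak amplitudes): the mean absolute difference between consecutive cycles divided by the mean value. This is a generic textbook-style definition, not the AMPEX feature set, and the function name and sample values are assumptions. It presumes cycles have already been segmented, which is exactly what becomes unreliable in irregular substitution voices.

```python
def local_perturbation(values):
    """Mean absolute consecutive difference divided by the mean value.

    Hypothetical helper: pass period lengths for a jitter-like ratio,
    or cycle peak amplitudes for a shimmer-like ratio.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Made-up period lengths in seconds: perfectly regular versus alternating.
steady_periods = [0.010, 0.010, 0.010, 0.010]
alternating_periods = [0.010, 0.011, 0.010, 0.011]
jitter_steady = local_perturbation(steady_periods)
jitter_alternating = local_perturbation(alternating_periods)
print(f"steady: {jitter_steady:.4f}, alternating: {jitter_alternating:.4f}")
```

    Because the ratio depends entirely on correct cycle boundaries, it degrades quickly when periodicity is weak, which is consistent with the abstract's conclusion that voicing quantification is more informative for these voices.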