Search CORE

133 research outputs found

Acoustic measurement of overall voice quality in sustained vowels and continuous speech

Author: Maryn Youri
Publication venue: Ghent University. Faculty of Medicine and Health Sciences
Publication date: 01/01/2010

Measurement of dysphonia severity involves auditory-perceptual evaluation and acoustic analysis of sound waves. A meta-analysis of proportional associations between these two methods showed that many popular perturbation metrics, noise-to-harmonics ratios and other ratios do not yield reasonable results. However, the same meta-analysis demonstrated that the validity of specific autocorrelation- and cepstrum-based measures was much more convincing, and identified ‘smoothed cepstral peak prominence’ as the most promising metric of dysphonia severity. Original research confirmed the inferiority of perturbation measures and the superiority of cepstral indices in dysphonia measurement of laryngeal-vocal and tracheoesophageal voice samples. However, to be truly representative of daily voice use patterns, measurement of overall voice quality is ideally founded on the analysis of both sustained vowels and continuous speech. A customized method for including both sample types and calculating the multivariate Acoustic Voice Quality Index (AVQI) was constructed for this purpose. An original study of the AVQI revealed acceptable results in terms of initial concurrent validity, diagnostic precision, internal and external cross-validity, and responsiveness to change. It was thus concluded that the AVQI can track changes in dysphonia severity across the voice therapy process. There are many freely and commercially available computer programs and systems for acoustic metrics of dysphonia severity. We investigated agreements and differences between two commonly available programs (Praat and the Multi-Dimensional Voice Program) and systems. The results indicated that clinicians should not compare frequency perturbation data across systems and programs, or amplitude perturbation data across systems. Finally, acoustic information can also be used as a biofeedback modality during voice exercises. Based on a systematic literature review, it was cautiously concluded that acoustic biofeedback can be a valuable tool in the treatment of phonatory disorders. When applied with caution, acoustic algorithms (particularly cepstrum-based measures and the AVQI) merit a special role in the assessment and/or treatment of dysphonia.

Ghent University Academic Bibliography

Cepstral peak prominence: a comprehensive analysis

Author: Abramowitz
Alpan
Awan
Balasubramanium
Blankenship
Cannito
Chen
Childers
Clapham
Dejonckere
Eadie
Esposito
Ferrer
Fraile
Fraj
Haderlein
Halberstam
Hartl
Haykin
Heman-Ackah
Hillenbrand
Howard
Juan Ignacio Godino-Llorente
Kumar
Leong
Lowell
Maryn
Medhurst
Mehta
Merk
Moers
Murphy
Nagle
Noll
Oppenheim
Peterson
Rabiner
Rosa
Rubén Fraile
Samlan
Shanmugan
Shrivastav
Shue
Solomon
Story
Vasilakis
Vipperla
Watts
Wolfe
Yap
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

An analytical study of cepstral peak prominence (CPP) is presented, intended to provide insight into its meaning and its relation to voice perturbation parameters. To carry out this analysis, a parametric approach is adopted in which voice production is modelled using the traditional source-filter model and the first cepstral peak is assumed to have a Gaussian shape. It is concluded that the meaning of CPP is very similar to that of the first rahmonic, and some insights are provided on its dependence on fundamental frequency and vocal tract resonances. It is further shown that CPP integrates measures of voice waveform and periodicity perturbations, whether in amplitude, frequency or noise.
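As a rough illustration of what CPP measures, the sketch below computes it for a single frame: the height of the cepstral peak above a regression line fitted through the cepstrum. This is not the parametric model of the paper above, only one common way to compute the quantity. The function names, the 60 to 330 Hz search range, the Hann window, and fitting the line over the search range only are all assumptions made for this example; published implementations differ in these details.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of x (naive O(n^2) version)."""
    n = len(x)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [abs(sum(x[t] * twiddle[(k * t) % n] for t in range(n)))
            for k in range(n)]


def cepstral_peak_prominence(signal, fs, f0_min=60.0, f0_max=330.0):
    """Return (CPP in dB, peak quefrency in seconds) for one frame.

    Hypothetical helper: search range and regression range are assumptions.
    """
    n = len(signal)
    # Hann window to limit spectral leakage.
    frame = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
             for i, s in enumerate(signal)]
    # Log magnitude spectrum in dB (small floor avoids log of zero).
    log_spec = [20 * math.log10(m + 1e-12) for m in dft_magnitudes(frame)]
    # Cepstrum: spectrum of the log spectrum, itself expressed in dB.
    cep_db = [20 * math.log10(m / n + 1e-12) for m in dft_magnitudes(log_spec)]
    # Quefrency range (in samples) where a plausible pitch period can lie.
    lo = int(math.ceil(fs / f0_max))
    hi = min(int(fs / f0_min), n // 2)
    qs = list(range(lo, hi + 1))
    peak = max(qs, key=lambda q: cep_db[q])
    # Least-squares regression line through the cepstrum over that range.
    mean_q = sum(qs) / len(qs)
    mean_c = sum(cep_db[q] for q in qs) / len(qs)
    slope = (sum((q - mean_q) * (cep_db[q] - mean_c) for q in qs)
             / sum((q - mean_q) ** 2 for q in qs))
    line_at_peak = mean_c + slope * (peak - mean_q)
    # CPP is the distance from the peak down to the regression line.
    return cep_db[peak] - line_at_peak, peak / fs
```

Called on a 512-sample frame of a strongly periodic signal, the function should return a larger CPP than on a frame of white noise, which is the behaviour that makes the measure useful as an index of periodicity.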

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Archivo Digital UPM

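Acoustic analysis of the voice: temporal, spectral and cepstral measures in the normal voice with Praat in a sample of Spanish speakers [Análisis acústico de la voz: medidas temporales, espectrales y cepstrales en la voz normal con el Praat en una muestra de hablantes de español]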

Author: Alejandra Jiménez
Jonathan Delgado
Laura M. Izquierdo
Nieves Mª León
Publication venue: University of Castilla La Mancha; Complutense University of Madrid; Association of Speech and Language Therapist of Castilla La Mancha
Publication date: 01/01/2017

Acoustic analysis is a tool that provides objective information about the voice. In recent years, long-term average spectrum (LTAS) and cepstral measures, such as smoothed cepstral peak prominence (CPPs), have complemented the traditionally used measures, showing in many studies a high correlation with the severity of dysphonia. The aim of this descriptive study was to calculate, in Praat, normative values of temporal, spectral and cepstral measures in a sample of 50 Spanish speakers (25 men and 25 women), taking into account the main factors that influence their reliability, such as the type of microphone, the ambient noise level, the analysis program and the acoustic parameters used. Two voice samples were recorded for each subject: 1) a sustained /a/, from which CPPs and the parameters of fundamental frequency (F0), noise, and frequency and amplitude perturbation were calculated, and 2) a connected speech sample, from which CPPs and the LTAS slopes were calculated. The results of the sustained vowel analysis show significant sex differences in F0, absolute jitter and all amplitude perturbation and noise parameters. In connected speech, significant differences between men and women are observed in the spectral slope obtained from the trend line through the LTAS and in CPPs.
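The frequency and amplitude perturbation parameters mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual "local" definitions (mean absolute difference between consecutive cycles, optionally divided by the mean); the function names are hypothetical and the input is assumed to be a list of already extracted glottal periods or peak amplitudes, which the study obtained with Praat rather than with code like this.

```python
def jitter_absolute(periods):
    """Mean absolute difference between consecutive periods (seconds)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return sum(diffs) / len(diffs)


def jitter_local_percent(periods):
    """Absolute jitter divided by the mean period, as a percentage."""
    mean_period = sum(periods) / len(periods)
    return 100.0 * jitter_absolute(periods) / mean_period


def shimmer_local_percent(amplitudes):
    """Same idea applied to cycle peak amplitudes instead of periods."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_amplitude = sum(amplitudes) / len(amplitudes)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_amplitude
```

For example, four periods of 5.0, 5.1, 4.9 and 5.0 ms have consecutive differences of 0.1, 0.2 and 0.1 ms, so the absolute jitter is their mean and the local jitter is that mean divided by the 5.0 ms average period.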

Universidad de Castilla-La Mancha: Repositorio Universitario Institucional de Recursos Abiertos (RUIdeRA)

Directory of Open Access Journals

DIALNET

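CLASSIFICATION OF PARKINSON'S DISEASE AND OTHER NEUROLOGICAL DISORDERS USING VOICE FEATURE EXTRACTION AND REDUCTION TECHNIQUES [KLASYFIKACJA CHOROBY PARKINSONA I INNYCH ZABURZEŃ NEUROLOGICZNYCH Z WYKORZYSTANIEM EKSTRAKCJI CECH GŁOSOWYCH I TECHNIK REDUKCJI]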

Author: Benba Achraf
Hammouch Ahmed
Majdoubi Oumaima
Publication venue: 'Politechnika Lubelska'
Publication date: 30/09/2023

This study aimed to differentiate individuals with Parkinson's disease (PD) from those with other neurological disorders (ND) by analyzing voice samples, considering the association between voice disorders and PD. Voice samples were collected from 76 participants using different recording devices and conditions, with participants instructed to sustain the vowel /a/ comfortably. PRAAT software was employed to extract features including autocorrelation (AC), cross-correlation (CC), and Mel frequency cepstral coefficients (MFCC) from the voice samples. Principal component analysis (PCA) was used to reduce the dimensionality of the features. Classification Tree (CT), Logistic Regression, Naive Bayes (NB), Support Vector Machines (SVM), and Ensemble methods were employed as supervised machine learning techniques for classification. Each method has distinct strengths and characteristics, which allowed a comprehensive evaluation of their effectiveness in distinguishing PD patients from individuals with other neurological disorders. The Naive Bayes kernel, using seven PCA-derived components, achieved the highest accuracy rate of 86.84% among the tested classification methods. It is worth noting that classifier performance may vary with the dataset and the specific characteristics of the voice samples. In conclusion, this study demonstrated the potential of voice analysis as a diagnostic tool for distinguishing PD patients from individuals with other neurological disorders. By employing a variety of voice analysis techniques and machine learning algorithms, a notable accuracy rate was attained. However, further research and validation using larger datasets are required to consolidate and generalize these findings for future clinical applications.
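The classification step can be sketched as follows. This is not the study's pipeline: the abstract reports a kernel Naive Bayes on seven PCA components, whereas the sketch swaps in a plain Gaussian Naive Bayes, omits the PCA step, and assumes the feature vectors have already been reduced. The leave-one-out evaluation, the function names and the variance floor are likewise assumptions made for illustration.

```python
import math


def fit_gnb(X, y):
    """Fit per-class log prior, feature means and variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(c) / n for c in cols]
        # Small floor keeps the variance positive for tiny classes.
        variances = [sum((v - m) ** 2 for v in c) / n + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict_gnb(model, x):
    """Return the label with the highest log posterior for sample x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


def leave_one_out_accuracy(X, y):
    """Train on all samples but one, test on the held-out one, repeat."""
    hits = 0
    for i in range(len(X)):
        model = fit_gnb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict_gnb(model, X[i]) == y[i]
    return hits / len(X)
```

Here `X` is a list of feature vectors (one per participant) and `y` the matching list of labels such as "PD" and "ND". With a small cohort, holding out one sample at a time is a common way to estimate accuracy without setting aside a separate test set.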

Lublin University of Technology Journals

Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

Author
Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

The proceedings of the MAVEBA Workshop, which is held on a biannual basis, collect the scientific papers presented at the conference as oral and poster contributions. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.) and the IEEE Biomedical Engineering Soc. Special issues of international journals, collecting selected papers from the conference, have been and will be published.

Directory of Open Access Books (DOAB)

Phonation Types in Marathi: An Acoustic Investigation

Author: Berkson Kelly Harper
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2013

This dissertation presents a comprehensive instrumental acoustic analysis of phonation type distinctions in Marathi, an Indic language with numerous breathy voiced sonorants and obstruents. Important new facts about breathy voiced sonorants, which are crosslinguistically rare, are established: male and female speakers cue breathy phonation in sonorants differently, there is an abundance of trading relations, and--critically--phonation type distinctions are not cued as well by sonorants as by obstruents. Ten native speakers (five male, five female) were recorded producing Marathi words embedded in a carrier sentence. Tokens included plain and breathy voiced stops, affricates, nasals, laterals, rhotics, and approximants before the vowels [a] and [e]. Measures reported for consonants and subsequent vowels include duration, F0, Cepstral Peak Prominence (CPP), and corrected H1-H2*, H1-A1*, H1-A2*, and H1-A3* values. As expected, breathy voice is associated with decreased CPP and increased spectral values. A strong gender difference is revealed: low-frequency measures like H1-H2* cue breathy phonation more reliably in male speech, while CPP--which provides information about the aspiration noise included in the signal--is a more reliable cue in female speech. Trading relations are also reported: time and again, where one cue is weak or absent another cue is strong or present, underscoring the importance of including both genders and multiple vowel contexts when testing phonation type differences. Overall, the cues that are present for obstruents are not necessarily mirrored by sonorants. These findings are interpreted with reference to Dispersion Theory (Flemming 1995; Liljencrants & Lindblom 1972; Lindblom 1986, 1990). While various incarnations of Dispersion Theory focus on different aspects of perceptual and auditory distinctiveness, a basic claim is that phonological contrasts must be perceptually distinct: contrasts that are subject to great confusability are phonologically disfavored. The proposal, then, is that the typology of breathy voiced sonorants is due in part to the fact that they are not well differentiated acoustically. Breathy voiced sonorants are crosslinguistically rare because they do not make for strong phonemic contrasts.

KU ScholarWorks

Acoustic analysis in mild cognitive diagnosis: systematic review of the literature in 2008–2020

Author: Rodríguez-Fandiño Juan Camilo
Publication venue: Revista Ciencias de la Salud
Publication date: 01/01/2021

Introduction: Recent research has described changes in vocal tone and timbre production that occur in late adulthood. These changes indicate early cognitive disturbances, even in preclinical stages of cognitive decline. This study aims to identify relevant findings from the literature regarding acoustic analysis in elderly adults with cognitive impairment. Material and methods: A systematic review was conducted, in which the following databases were consulted: PlosOne, Science Direct, PubMed/pmc, and Google Scholar. Search terms such as acoustic analysis, Alzheimer’s disease, mild cognitive impairment, prosody, voice analysis, and voice production were used. Additionally, empirical articles describing acoustic analysis in elderly adults with cognitive risk were included. The evaluation was performed independently by two evaluators, who determined the risk of bias in the review. A total of 59 articles related to the topic were found, of which 25 met the inclusion criteria. Results: The reviewed articles identified changes in linguistic and paralinguistic prosody, timbre, and vocal tonality, which are associated with cognitive decline in the elderly. Conclusion: Study protocols in acoustic analysis could be a good tool to support the differential clinical diagnosis of cognitive deterioration in late adulthood and a good opportunity to identify risk in preclinical stages of dementia.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Optimization and automation of relative fundamental frequency for objective assessment of vocal hyperfunction

Author: Lien Yu-An
Publication venue
Publication date: 28/10/2015

The project objective is to improve clinical assessment and diagnosis of the voice disorder vocal hyperfunction (VH). VH is a condition characterized by excessive laryngeal and paralaryngeal tension, and is assumed to be the underlying cause of the majority of voice disorders. Current clinical assessment of VH is subjective and demonstrates poor inter-rater reliability. Recent work indicates that a new acoustic measure, relative fundamental frequency (RFF), is sensitive to the maladaptive functional behaviors associated with VH and can potentially be used to objectively characterize VH. Here, we explored and enhanced the potential of RFF as a measure of VH in three ways. First, the current protocol for RFF estimation was optimized to simplify the recording procedure and reduce estimation time. Second, RFF was compared with the current state-of-the-art measures of VH: listener perception of vocal effort and the aerodynamic ratio of sound pressure level to subglottal pressure level. Third, an automated algorithm that used the optimized recording protocol was developed and validated against manual estimation methods and listener perception. This work enables large-scale studies on RFF to determine the specific physiological elements that contribute to the measure’s ability to capture VH and may potentially provide a non-invasive and readily implemented solution for this long-standing clinical issue.
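The abstract does not give the RFF formula. A minimal sketch, assuming the commonly used definition in which the F0 of each vocal cycle near a voiceless consonant is expressed in semitones relative to a steady-state reference F0, is shown below. The function name and the choice of reference are assumptions; the cycle-level F0 values are taken as given, and extracting them is the hard part that the automated algorithm addresses.

```python
import math


def relative_f0_semitones(cycle_f0_hz, reference_f0_hz):
    """Express each cycle's F0 in semitones relative to a reference F0.

    Hypothetical helper: 12 semitones per octave, so a doubling of
    frequency maps to +12 and a halving to -12.
    """
    return [12.0 * math.log2(f0 / reference_f0_hz) for f0 in cycle_f0_hz]
```

A cycle whose F0 equals the reference has an RFF of 0 semitones, and cycles with lower F0 than the reference yield negative values.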

Boston University Institutional Repository (OpenBU)

COMPUTERISED GRBAS ASSESSMENT OF VOICE QUALITY

Author: Jalali Farideh
Publication venue
Publication date: 01/08/2016

The University of Manchester - Institutional Repository

Voicing quantification is more relevant than period perturbation in substitution voices: an advanced acoustical study

Author: A Olthoff
C Manfredi
C. Manfredi
CJ As Van
G Bertino
J Schoentgen
J. P. Martens
J. Schoentgen
L Crevier-Buchman
LM Immerseel Van
M Fröhlich
M Moerman
M. B. J. Moerman
P. H. Dejonckere
PH Dejonckere
Y Maryn
Publication venue: Springer-Verlag
Publication date: 01/01/2012

Quality of substitution voicing—i.e., phonation with a voice that is not generated by the vibration of two vocal folds—cannot be adequately evaluated with routinely used software for acoustic voice analysis that is aimed at ‘common’ dysphonias and nearly periodic voice signals. The AMPEX analysis program (Van Immerseel and Martens) has been shown previously to be able to detect periodicity in irregular signals with background noise, and to be suited for running speech. The validity of this analysis program is first tested using realistic synthesized voice signals with known levels of cycle-to-cycle perturbations and additive noise. Second, exhaustive acoustic analysis is performed of the voices of 116 patients surgically treated for advanced laryngeal cancer and recorded in seven European academic centers. All of them read out a short phonetically balanced passage. Patients were divided into six groups according to the oscillating structures they used to phonate. Results show that features related to quantification of voicing enable a distinction between the different groups, while the features reporting F0-instability fail to do so. Acoustic evaluation of voice quality in substitution voices thus best relies upon voicing quantification

Crossref

Springer - Publisher Connector

Ghent University Academic Bibliography

PubMed Central

DI-fusion