
    A database and digital signal processing framework for the perceptual analysis of voice quality

    Bermúdez de Alvear RM, Corral J, Tardón LJ, Barbancho AM, Fernández Contreras E, Rando Márquez S, Martínez-Arquero AG, Barbancho I. A database and digital signal processing framework for the perceptual analysis of voice quality. Pan European Voice Conference: PEVOC 11 Abstract Book. Aug. 31-Sept. 2, 2015.

    Introduction. Clinical assessment of dysphonia relies on perceptual as much as on instrumental methods of analysis [1]. Perceptual auditory analysis is potentially subject to several internal and external sources of bias [2]. Furthermore, the acoustic analyses that have been used to characterize pathological voices objectively are likely to be affected by confounding variables such as the signal processing or the hardware and software specifications [3]. For these reasons the poor correlation between perceptual ratings and acoustic measures remains a controversial matter [4]. The availability of annotated databases of voice samples is therefore of prime importance for clinical and research purposes. Databases for digital processing of the vocal signal are usually built from sustained vowels produced by English-speaking subjects [5]. However, phonemes vary from one language to another and, to the best of our knowledge, there are no annotated databases of Spanish sustained vowels from healthy or dysphonic voices. This work presents our first steps toward filling this gap. To aid clinicians and researchers in the perceptual assessment of voice quality, a two-fold objective was attained: on the one hand, a database of healthy and disordered Spanish voices was developed; on the other, an automatic analysis scheme was built on signal processing algorithms and supervised machine learning techniques.

    Material and methods. A preliminary annotated database was created with 119 recordings of the sustained Spanish /a/, perceptually labeled by three experts experienced in vocal quality analysis. It is freely available under Links on the ATIC website (www.atic.uma.es). Voice signals were recorded using a headset cardioid condenser microphone (AKG C-544 L) positioned 5 cm from the commissure of the speaker’s mouth. Speakers were instructed to sustain the Spanish vowel /a/ for 4 seconds. The microphone was connected to an Edirol R-09HR digital recorder. Voice signals were digitized at 16 bits with a 44100 Hz sampling rate. The first and last 0.5-second segments were then cut and the 3-second middle portion was selected for acoustic analysis. Judges used Sennheiser HD219 headphones to evaluate the voice samples perceptually. To label the recordings, raters used the Grade-Roughness-Breathiness (GRB) perceptual scale, a modified version of Hirano’s original GRBAS scale as later adapted by Dejonckere et al. [6]. To improve intra- and inter-rater agreement, two modifications were introduced in the rating procedure: the resolution of the 0-3 point scale was increased by adding subintervals to the standard 0-3 intervals, and judges were given a written protocol with explicit definitions of the subinterval boundaries. In this way judges could compensate for the instability that might arise in their internal representations under the influence of the perceptual context [7]. The raters performed their perceptual evaluations simultaneously, with the Sennheiser HD219 headphones connected to a Behringer HA4700 Powerplay Pro-XL multi-channel headphone preamplifier. The Yin algorithm [8] was selected as the initial front-end to identify voiced frames and extract their fundamental frequency. For the digital processing of the voice signals, some conventional acoustic parameters [6] were selected. To complete the analysis, Mel-Frequency Cepstral Coefficients (MFCC) were also calculated because they are based on an auditory model and are thus closer to the response of the auditory system than conventional features.

    Results. In the perceptual evaluation, excellent intra-rater agreement and very good inter-rater agreement were achieved. During the supervised machine learning stage, some conventional features showed unexpectedly low performance in the selected classification scheme. Mel-Frequency Cepstral Coefficients were promising for classifying samples with normal or quasi-normal voice quality.

    Discussion and conclusions. Although it is still small and unbalanced, the present annotated database of voice samples can provide a basis for the development of other databases and automatic classification tools. Other authors [9, 10, 11] also found that modeling the non-linear auditory response during signal processing can help develop objective measures that correspond better with perceptual data. However, the classification of highly disordered voices remains a challenge for this set of features, since such voices cannot be correctly classified by either the conventional variables or the measures based on the auditory model. The current results warrant further research into the usability of other types of voice samples and features for automatic classification schemes. Different digital processing steps could be used to improve classifier performance, and other types of classifiers could be considered in future studies.

    Acknowledgment. This work was funded by the Spanish Ministerio de Economía y Competitividad, Project No. TIN2013-47276-C6-2-R, and was carried out at the Campus de Excelencia Internacional Andalucía Tech, Universidad de Málaga.

    References
    [1] Carding PN, Wilson JA, MacKenzie K, Deary IJ. Measuring voice outcomes: state of the science review. The Journal of Laryngology and Otology 2009;123,8:823-829.
    [2] Oates J. Auditory-perceptual evaluation of disordered voice quality: pros, cons and future directions. Folia Phoniatrica et Logopaedica 2009;61,1:49-56.
    [3] Maryn et al. Meta-analysis on acoustic voice quality measures. J Acoust Soc Am 2009;126,5:2619-2634.
    [4] Vaz Freitas et al. Correlation Between Acoustic and Audio-Perceptual Measures. J Voice 2015;29,3:390.e1
    [5] “Multi-Dimensional Voice Program (MDVP) Model 5105. Software Instruction Manual”, Kay PENTAX, A Division of PENTAX Medical Company, 2 Bridgewater Lane, Lincoln Park, NJ 07035-1488 USA, November 2007.
    [6] Dejonckere PH, Bradley P, Clemente P, Cornut G, Crevier-Buchman L, Friedrich G, Van De Heyning P, Remacle M, Woisard V. A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques. Guideline elaborated by the Comm. on Phoniatrics of the European Laryngological Society (ELS). Eur Arch Otorhinolaryngol 2001;258:77–82.
    [7] Kreiman et al. Voice Quality Perception. J Speech Hear Res 1993;36:21-4
    [8] De Cheveigné A, Kawahara H. YIN, a fundamental frequency estimator for speech and music. J Acoust Soc Am 2002;111,4:1917.
    [9] Shrivastav et al. Measuring breathiness. J Acoust Soc Am 2003;114,4:2217-2224.
    [10] Saenz-Lechon et al. Automatic Assessment of voice quality according to the GRBAS scale. Eng Med Biol Soc Ann 2006;1:2478-2481.
    [11] Fredouille et al. Back-and-forth methodology for objective voice quality assessment: from/to expert knowledge to/from automatic classification of dysphonia. EURASIP J Appl Si Pr 2009.
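The trimming step described in the abstract above (a 4-second vowel sampled at 44100 Hz, with the first and last 0.5 seconds discarded) can be illustrated with a short sketch. This is not the authors' code: the function name `trim_edges` and the all-zero stand-in recording are assumptions made purely for illustration.

```python
def trim_edges(samples, sample_rate, edge_seconds=0.5):
    """Drop `edge_seconds` from both ends of a recording and return the middle.

    `samples` is any sequence of audio samples; the name and signature are
    hypothetical and do not come from the original study.
    """
    edge = int(round(edge_seconds * sample_rate))
    if 2 * edge >= len(samples):
        raise ValueError("recording is too short to trim both edges")
    return samples[edge:len(samples) - edge]


# Stand-in for a 4-second recording at 44100 Hz (silence, for illustration only).
SAMPLE_RATE = 44100
recording = [0] * (4 * SAMPLE_RATE)

middle = trim_edges(recording, SAMPLE_RATE)
print(len(middle) / SAMPLE_RATE)  # duration of the retained segment, in seconds
```

Working in whole samples rather than seconds keeps the cut points exact; at 44100 Hz each 0.5-second edge is 22050 samples.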

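    Assessment of the AVQI (Acoustic Voice Quality Index) as a measure of dysphonia severity in Spanish [Valoración del AVQI (Acoustic Voice Quality Index) como medida de la severidad de la disfonía en castellano]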

    Objectives. The Acoustic Voice Quality Index (AVQI) is an objective method for quantifying the severity of dysphonia based on the analysis of continuous speech and a sustained vowel. The aim of this study is to validate the AVQI in Spanish and to study its diagnostic accuracy. Methods. A control group of 24 subjects and a group of 37 dysphonic subjects were studied. Each voice sample was analyzed perceptually by three judges to obtain the overall grade of dysphonia severity (G). Intra- and inter-judge agreement (ICC) was calculated. The external validity of the AVQI was obtained by analyzing the Spearman correlation between the index and the perceptual parameter G. To study the diagnostic accuracy of the AVQI, the ROC curve was used and its sensitivity, specificity, and positive LR (+) and negative LR (-) likelihood ratios were estimated. The acoustic analysis was performed with the Praat program using the AVQI2 algorithm of Maryn et al. Results. The intra-judge ICCs were very high (ICC > 0.940), as was the inter-judge ICC (0.986). The ROC curve revealed excellent diagnostic accuracy (area under the curve: 0.94). The AVQI cut-off point in the present study was 3.20, with a sensitivity of 0.838 and a specificity of 0.917; LR (+) = 10.10 and LR (-) = 0.10. Comments and conclusions. The validity of the AVQI in Spanish and its diagnostic accuracy for differentiating between healthy and pathological voices have been demonstrated.
    Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
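As a worked illustration of the diagnostic-accuracy figures in the abstract above, the sketch below applies the textbook definitions LR (+) = sensitivity / (1 - specificity) and LR (-) = (1 - sensitivity) / specificity to the reported sensitivity (0.838) and specificity (0.917). The function name `likelihood_ratios` is hypothetical, and the abstract does not state how the authors computed their own values.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (standard definitions)."""
    if not (0.0 < specificity <= 1.0 and 0.0 <= sensitivity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    if specificity == 1.0:
        raise ValueError("LR+ is undefined when specificity equals 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as reported at the AVQI cut-off of 3.20.
lr_pos, lr_neg = likelihood_ratios(0.838, 0.917)
print(round(lr_pos, 2), round(lr_neg, 2))
```

With these rounded inputs the formula gives roughly 10.10 for LR (+), in line with the abstract, and roughly 0.18 for LR (-); the abstract reports 0.10 and does not say how that value was derived.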

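    Relationships between voice use and burnout in early childhood and primary school teachers in Málaga [Relaciones entre el uso de la voz y el burnout en los docentes de educación infantil y primaria de Málaga]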

    The relationship between vocal use profile and burnout syndrome is analyzed in a sample of 244 Early Childhood and Primary Education teachers from state schools in Málaga. These teachers obtain low scores on the Emotional Exhaustion (CE), Depersonalization (DP) and Personal Accomplishment (RP) scales. Significant relationships were found between the vocal profile and the burnout scales: CE is related to vocal use in class and to cervical muscle tension; DP is associated with the vocal intensity used in activities outside the classroom; RP shows a trend toward significance with cervical tension during the working day.

    Revista de logopedia, foniatría y audiología

    An account of the basic principles of motor learning, its phases and its administration guidelines, focusing on how these apply to the treatment of people with voice problems of neurodegenerative origin. The article also analyzes how motor learning theories are applied in the Intensive Voice Treatment method (LSVT) for the phonatory problems of patients with Parkinson's disease, yielding therapeutic outcomes above the pre-treatment level.

    Revista de logopedia, foniatría y audiología

    The article reflects on the treatment of voice disorders in neurodegenerative diseases. First, it sets out the factors that favor the efficacy of dysphonia treatment in these diseases: early intervention and specific planning of therapeutic goals and resources according to the contextual factors of each case. Second, it explains intervention protocols used to alleviate the three most common types of phonatory-respiratory deficit in this class of diseases.

    Revista española de pedagogía

    This study analyzes the association between teachers' use of their voice in their usual activities (teaching, work meetings and private life) and the level of occupational stress they report, the so-called burnout syndrome. The MBI (Maslach Burnout Inventory) was administered to a sample of Early Childhood and Primary Education teachers in Málaga, yielding low scores on the Emotional Exhaustion (CE), Depersonalization (DP) and Personal Accomplishment (RP) scales. The results are controversial because they do not fit the model proposed by the MBI; accordingly, the authors interpret that there is a relationship between intensive voice use and the burnout scales, and they propose a new orientation for the model. They suggest that the study of teacher stress remains complex and should be approached with different tools and a multidimensional model.

    Evaluation of Psychometric Properties of Voice Activity and Participation Profile (VAPP): A Spanish Version

    Objectives. This study aimed to analyze the psychometric properties of the Spanish version of the Voice Activity and Participation Profile (SVAPP) questionnaire. Study design. A randomized, cross-sectional sampling strategy with controls was used. Methods. Two samples with a total of 169 participants were analyzed: 61 men (mean age 37.02) and 108 women (mean age 37.78). Of these participants, 112 were patients and 57 were controls. The instrument was submitted to reliability (internal consistency and corrected item-total correlations) and reproducibility analyses. Validation assessment was based on construct validity, convergent validity, discriminant validity, and concurrent validity. Results. The global internal consistency was excellent (Cronbach's α = 0.976), corrected item-total correlations were satisfactory and ranged from 0.63 to 0.89, and factor loadings were above 0.50. The different subscales showed good internal consistency (alpha coefficients ranged from 0.830 to 0.956) and test-retest values were consistently associated. The exploratory factor analysis evidenced a strongly defined five-factor internal structure, with factor loadings ranging from 0.51 to 0.86. Convergent validity demonstrated that all subscales and scores were very strongly correlated (Pearson r above 0.735) and significantly associated. The discriminant validity analysis showed that the SVAPP had good specificity to distinguish dysphonic from healthy voice subjects. Concurrent validity with the Spanish version of the Voice Handicap Index (SVHI) showed very strong correlations between total scores, and between the SVHI total score and the SVAPP Daily and Social Communication subscales; correlations between the subscales of the two tests were strong; only the correlations between the SVAPP Work and SVHI Physical sections were moderate. Conclusions. The findings of the present study provide evidence for the reliability and validity of the SVAPP questionnaire and offer insight into the implications of voice disorders for Spanish patients' quality of life. However, further investigations are required.
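For readers unfamiliar with the internal-consistency statistic cited in the abstract above, the following is a minimal sketch of Cronbach's α using its standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses are assumptions for illustration; they are not the SVAPP data or the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents' item-score lists.

    Every respondent must answer the same k items (k >= 2). Sample variances
    (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("all respondents must have the same number of items")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): four respondents, three perfectly consistent items.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Items that rise and fall together across respondents push α toward 1, which is why values such as the reported 0.976 are read as excellent internal consistency.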

    Subcutaneous anti-COVID-19 hyperimmune immunoglobulin for prevention of disease in asymptomatic individuals with SARS-CoV-2 infection: a double-blind, placebo-controlled, randomised clinical trial

    Summary: Background: Anti-COVID-19 hyperimmune immunoglobulin (hIG) can provide standardized and controlled antibody content. Data from controlled clinical trials using hIG for the prevention or treatment of COVID-19 outpatients have not been reported. We assessed the safety and efficacy of subcutaneous anti-COVID-19 hyperimmune immunoglobulin 20% (C19-IG20%) compared to placebo in preventing development of symptomatic COVID-19 in asymptomatic individuals with SARS-CoV-2 infection. Methods: We did a multicentre, randomized, double-blind, placebo-controlled trial, in asymptomatic unvaccinated adults (≥18 years of age) with confirmed SARS-CoV-2 infection within 5 days between April 28 and December 27, 2021. Participants were randomly assigned (1:1:1) to receive a blinded subcutaneous infusion of 10 mL with 1 g or 2 g of C19-IG20%, or an equivalent volume of saline as placebo. The primary endpoint was the proportion of participants who remained asymptomatic through day 14 after infusion. Secondary endpoints included the proportion of individuals who required oxygen supplementation, any medically attended visit, hospitalisation, or ICU, and viral load reduction and viral clearance in nasopharyngeal swabs. Safety was assessed as the proportion of patients with adverse events. The trial was terminated early due to a lack of potential benefit in the target population in a planned interim analysis conducted in December 2021. ClinicalTrials.gov registry: NCT04847141. Findings: 461 individuals (mean age 39.6 years [SD 12.8]) were randomized and received the intervention within a mean of 3.1 (SD 1.27) days from a positive SARS-CoV-2 test. In the prespecified modified intention-to-treat analysis that included only participants who received a subcutaneous infusion, the primary outcome occurred in 59.9% (91/152) of participants receiving 1 g C19-IG20%, 64.7% (99/153) receiving 2 g, and 63.5% (99/156) receiving placebo (difference in proportions 1 g C19-IG20% vs. 
placebo, −3.6%; 95% CI −14.6% to 7.3%, p = 0.53; 2 g C19-IG20% vs. placebo, 1.1%; 95% CI −9.6% to 11.9%, p = 0.85). None of the secondary clinical efficacy endpoints or virological endpoints were significantly different between study groups. Adverse event rate was similar between groups, and no severe or life-threatening adverse events related to investigational product infusion were reported. Interpretation: Our findings suggested that administration of subcutaneous human hyperimmune immunoglobulin C19-IG20% to asymptomatic individuals with SARS-CoV-2 infection was safe but did not prevent development of symptomatic COVID-19. Funding: Grifols.
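The difference in proportions reported above for 1 g C19-IG20% versus placebo (91/152 versus 99/156) can be approximated with a simple unadjusted Wald interval, sketched below. The Wald method is an assumption made here for illustration: the abstract does not state which interval method the trial used, so the bounds from this sketch are close to, but not necessarily identical to, the published ones. The function name `risk_difference_wald` is hypothetical.

```python
from math import sqrt


def risk_difference_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in proportions (a minus b) with an unadjusted Wald interval.

    z=1.96 corresponds to an approximate 95% confidence level.
    Returns (difference, lower bound, upper bound) as fractions.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Primary outcome counts from the abstract: 1 g arm versus placebo.
diff, low, high = risk_difference_wald(91, 152, 99, 156)
print(f"{diff:.1%} ({low:.1%} to {high:.1%})")
```

Because the interval spans zero, the sketch agrees with the abstract's reading that the primary outcome did not differ significantly between these two arms.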