138 research outputs found

    Electroglottographic Measures Based on GCI and GOI Detection Using Multiscale Product

    This paper deals with the estimation of glottal parameters, such as local pitch and open quotient, from the electroglottographic (EGG) signal. The estimation is based on glottal closing instants (GCIs) and glottal opening instants (GOIs) determined by a multiscale product of this signal. The wavelet transform of the EGG signal is computed with a quadratic spline function. Wavelet coefficients calculated on different dyadic scales show modulus maxima at localized discontinuities of the EGG signal. The detected maxima and minima correspond to the glottal opening and closing instants. To improve the precision of the estimate, we compute the multiscale product of the wavelet transform coefficients at three successive dyadic scales, which enhances edge detection. The multiscale product is a nonlinear combination of successive scales; it reduces noise and spurious peaks. We apply a cube root to the amplitude of the product to improve the representation of weak amplitudes. The method represents GCIs well and improves the detection of GOIs. It was tested on the Keele University database; it is effective and robust in multiple cases, even for atypical signals showing undetermined GOIs and multiple peaks at GCIs. Finally, precise measurement of these instants allows accurate estimation of prosodic parameters such as local pitch and open quotient.
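The multiscale product described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a first derivative of a Gaussian as a stand-in for the quadratic spline wavelet, uses scales 2, 4 and 8 samples, and the names `dog_kernel` and `multiscale_product` are hypothetical. Which sign of peak marks a closing or an opening instant depends on the polarity of the EGG recording.

```python
import numpy as np

def dog_kernel(scale):
    """First derivative of a Gaussian (assumed stand-in for the spline wavelet)."""
    half = 4 * scale
    t = np.arange(-half, half + 1, dtype=float)
    k = -t * np.exp(-t**2 / (2.0 * scale**2))
    return k / np.sum(np.abs(k))  # normalise so scales are comparable

def multiscale_product(x, scales=(2, 4, 8)):
    """Signed cube root of the pointwise product of three wavelet responses."""
    p = np.ones(len(x))
    for s in scales:
        p *= np.convolve(x, dog_kernel(s), mode="same")
    return np.cbrt(p)  # cube root lifts weak peaks and keeps the sign

# Synthetic pulse standing in for one EGG cycle (assumption, not real data).
x = np.zeros(400)
x[150:250] = 1.0
p = multiscale_product(x)
rise = int(np.argmax(p))  # sharp rising edge gives a positive peak
fall = int(np.argmin(p))  # sharp falling edge gives a negative peak
print(rise, fall)
```

Because an edge produces a response of the same sign at every scale while noise does not, the product keeps edge peaks and suppresses isolated spurious ones.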

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop was founded in 1999 out of a keenly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread to other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Vocal tract resonances in singing: variation with laryngeal mechanism for male operatic singers in chest and falsetto registers

    Seven male operatic singers sang the same notes and vowels in their chest and their falsetto registers, covering the overlap frequency range where two main laryngeal mechanisms can be identified by means of electroglottography: M1 in chest register and M2 in falsetto register. Glottal contact quotients determined using electroglottography were typically lower by 0.27 in M2 than in M1. Vocal tract resonance frequencies were measured by using broadband excitation at the lips and found to be typically lower in M2 than in M1 sung at the same pitch and vowel; R1 typically by 65 Hz and R2 by 90 Hz. These shifts in tract resonances were only weakly correlated with the changes in the contact quotient or laryngeal height that were measured simultaneously. There was considerable variability in the resonance tuning strategies used by the singers, and no evidence of a uniform systematic tuning strategy used by all singers. A simple model estimates that the shifts in resonance frequencies are consistent with the effective glottal area in falsetto register (M2) being 60%-70% of its value in chest register (M1).
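As a rough illustration of what a contact quotient measures, the sketch below uses a criterion-level rule: the fraction of one EGG cycle spent above a threshold placed between the cycle minimum and maximum. The abstract does not state which method the authors used, so the 50% level, the polarity assumption (larger values mean more contact) and the function name `contact_quotient` are all assumptions.

```python
import numpy as np

def contact_quotient(cycle, level=0.5):
    """Fraction of one EGG cycle above a criterion level (assumed method)."""
    c = np.asarray(cycle, dtype=float)
    lo, hi = c.min(), c.max()
    if hi == lo:
        raise ValueError("cycle has no amplitude variation")
    threshold = lo + level * (hi - lo)
    return float(np.mean(c > threshold))

# Idealised cycle of 100 samples with 40 samples of contact (synthetic).
cycle = np.zeros(100)
cycle[20:60] = 1.0
print(contact_quotient(cycle))
```

On real recordings the result depends noticeably on the chosen level, which is one reason reported contact quotients are hard to compare across studies that use different criteria.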

    Electroglottography in Medical Diagnostics of Vocal Tract Pathologies: A Systematic Review.

    Electroglottography (EGG) is a technology developed for measuring the vocal fold contact area during human voice production. Although it is considered subjective and unreliable as a sole diagnostic method, with the correct application of relevant computational methods it can become one of the most promising non-invasive diagnostic tools for voice disorders, in the form of a digital vocal tract pathology classifier. The aim of this study is to gather and evaluate existing digital voice quality assessment systems and vocal tract abnormality classification systems that rely on electroglottographic bio-impedance signals. To put the findings of this review in context, the subject of EGG is introduced first, with a summary of the most relevant existing research on EGG and a particular focus on its application in diagnostics. We then move on to the focal point of this work, which is describing and comparing the existing EGG-based digital voice pathology classification systems. Applying the PRISMA model, 13 articles were chosen and analysed in detail. Direct comparison between the chosen studies led to the main conclusions, which are described in Section 5 of this report. Certain limitations arising from the literature were also identified, such as a questionable understanding of the nature of EGG bio-impedance signals. Recommendations for future work are made, including the application of different methods for EGG feature extraction and the need for continued development of EGG datasets containing signals gathered in various conditions and with different equipment.

    On the use of voice descriptors for glottal source shape parameter estimation

    This paper summarizes the results of our investigations into estimating the shape of the glottal excitation source from speech signals. We employ the Liljencrants-Fant (LF) model describing the glottal flow and its derivative. The one-dimensional glottal source shape parameter Rd describes the transition in voice quality from a tense to a breathy voice. The parameter Rd has been derived from a statistical regression of the R waveshape parameters which parameterize the LF model. First, we introduce a variant of our recently proposed adaptation and range extension of the Rd parameter regression. Secondly, we discuss in detail the aspects of estimating the glottal source shape parameter Rd using the phase minimization paradigm. Based on the analysis of a large number of speech signals, we describe the major conditions that are likely to result in erroneous Rd estimates. Based on these findings, we investigate means to increase the robustness of the Rd parameter estimation. We use Viterbi smoothing to suppress unnatural jumps of the estimated Rd parameter contours within short time segments. Additionally, we propose to steer the Viterbi algorithm by exploiting the covariation of other voice descriptors to improve Viterbi smoothing. The novel Viterbi steering is based on a Gaussian Mixture Model (GMM) that represents the joint density of the voice descriptors and the Open Quotient (OQ) estimated from corresponding electroglottographic (EGG) signals. A conversion function derived from the mixture model predicts OQ from the voice descriptors. Converted to Rd, it defines an additional prior probability to adapt the partial probabilities of the Viterbi algorithm accordingly. Finally, we evaluate the performance of the phase minimization based methods, using both variants to adapt and extend the Rd regression, on one synthetic test set, as well as in combination with Viterbi smoothing and each variant of the novel Viterbi steering on one test set of natural speech. The experimental findings exhibit improvements for both Viterbi approaches.

    Vocal fold vibratory patterns in tense versus lax phonation contrasts

    This study explores the vocal fold contact patterns of one type of phonation contrast--the tense vs lax phonation contrasts of three Yi (Loloish) languages. These contrasts are interesting because neither phonation category is very different from modal voice, and because both phonations are largely independent of the languages' tonal contrasts. Electroglottographic (EGG) recordings were made in the field, and traditional EGG measures were derived. These showed many small but significant differences between the phonations, with tense phonation having greater contact quotients and briefer but slower changes in contact. Functional data analysis was then applied to entire EGG pulse shapes. The resulting first principal component was found to be most strongly related to the phonation contrasts, and correlated with almost all the traditional EGG measures. Unlike the traditional measures, however, this component also seems to capture differences in abruptness of contact. Furthermore, previously collected perceptual responses from native speakers of one of the languages correlated better with this component than with any other EGG measure or any acoustic measure. The differences between these tense and lax phonations are not large, but apparently they are consistent enough, and perceptually robust enough, to support this linguistic contrast.

    Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

    Clinical analysis of acoustic voice recordings is usually performed using classical perturbation measures, including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent the analysis of severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.

We present pre- and post-operative voice analysis of unilateral vocal fold paralysis (UVFP) patients undergoing standard medialisation thyroplasty surgery, and of healthy controls, using jitter, shimmer and noise-to-harmonic ratio (NHR), together with the nonlinear measures recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. Systematizing the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.
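For readers unfamiliar with the classical perturbation measures named in this abstract, the sketch below shows one common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value. Applied to cycle periods it gives jitter, applied to cycle peak amplitudes it gives shimmer. The abstract does not say which variants the study used, so this definition and the function name `local_perturbation` are assumptions.

```python
import numpy as np

def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value."""
    v = np.asarray(values, dtype=float)
    if v.size < 2:
        raise ValueError("need at least two cycles")
    return float(np.mean(np.abs(np.diff(v))) / np.mean(v))

# Synthetic cycle measurements (illustrative values, not study data).
periods = [0.010, 0.011, 0.010, 0.011]   # seconds per glottal cycle
amplitudes = [1.0, 0.9, 1.0, 0.9]        # peak amplitude per cycle
jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
print(jitter, shimmer)
```

Both measures require the signal to be segmented into individual cycles first, which is the step that tends to fail for severely dysphonic voices and motivates the nonlinear alternatives discussed in the abstract.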

    The impact of a standardized vocal loading test on vocal fold oscillations

    Introduction: Vocal loading capacity is an important aspect of vocal health and is measured using standardized vocal loading tests. However, it remains unclear how vocal fold oscillation patterns are influenced by a standardized vocal loading task. Methods: Twenty-one vocally healthy subjects (10 male, 11 female) were analyzed concerning the dysphonia severity index (DSI) and high speed videolaryngoscopy (HSV) on the vowel /i/ at a comfortable pitch and loudness before and after a standardized vocal loading test (10 min standardized text reading, at a level higher than 80 dB(A) measured at 30 cm from the mouth). Results: Changes in DSI were statistically significant, diminishing by 1.2 points after the vocal loading test, which was mainly caused by an increase of the minimum intensity. However, the pre-post comparison of HSV derived measures failed to show any statistically significant changes. Conclusion: It seems necessary to analyze the effects of a standardized vocal loading test on vocal fold oscillation patterns with respect to softest phonation and phonation threshold pressure rather than comfortable pitch and loudness. Level of evidenc

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies.