
    Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

    Clinical acoustic voice recording analysis is usually performed using classical perturbation measures including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent analysis for severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.
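As a rough illustration of the classical perturbation measures named above, the following sketch computes one common "local" form of jitter and shimmer: the mean absolute difference between consecutive cycles divided by the mean value. The per-cycle periods and amplitudes are hypothetical, and the function name `local_perturbation` is an assumption made for this example. Both measures need a cycle-by-cycle segmentation of the signal, which is hard to obtain for strongly aperiodic voices.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical per-cycle measurements from a sustained vowel (not data from the study)
periods_ms = [5.0, 5.1, 4.9, 5.0, 5.2]            # glottal cycle lengths
peak_amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]  # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)        # cycle-to-cycle period variation
shimmer = local_perturbation(peak_amplitudes)  # cycle-to-cycle amplitude variation
```

Multiplying either value by 100 gives the percentage form often quoted by clinical software.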

We present pre- and post-operative voice analysis in unilateral vocal fold paralysis (UVFP) patients undergoing standard medialisation thyroplasty surgery, and in healthy controls, using jitter, shimmer and noise-to-harmonic ratio (NHR), together with the nonlinear measures recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. After systematizing the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).
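The AUC values quoted here can be read as the probability that a randomly chosen case from one group scores higher than a randomly chosen case from the other. A minimal sketch of that rank-based calculation follows; the function name `auc` and the example scores are hypothetical and are not taken from the study.

```python
def auc(positives, negatives):
    """Probability that a random positive scores above a random negative; ties count half."""
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical measure values for two groups (illustration only)
patients = [0.61, 0.58, 0.70, 0.66]
controls = [0.42, 0.55, 0.60, 0.39]
separation = auc(patients, controls)
```

An AUC of 0.5 means the measure does not separate the groups at all, and 1.0 means it separates them perfectly.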

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.

    Estimation of Severity of Speech Disability through Speech Envelope

    In this paper, envelope detection of speech is discussed as a way to distinguish pathological cases among speech-disabled children. Speech signal samples from children aged five to eight years are considered for the present study. These speech signals are digitized and used to determine the speech envelope. The envelope is subjected to ratio mean analysis to estimate the disability. The analysis is conducted on ten speech signal samples related to both place of articulation and manner of articulation. The overall speech disability of a pathological subject is estimated from the results of this analysis. Comment: 8 pages, 4 figures, Signal & Image Processing Journal AIRC
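The abstract does not say how the envelope is obtained, so the sketch below uses one simple and widely used approach as an assumption: full-wave rectification followed by a moving average. The function name `envelope`, the window length and the test signal are all hypothetical, and the "ratio mean analysis" step is not reproduced because its definition is not given here.

```python
import math


def envelope(signal, window):
    """Full-wave rectify the signal, then smooth it with a centred moving average."""
    rectified = [abs(s) for s in signal]
    half = window // 2
    smoothed = []
    for i in range(len(rectified)):
        lo = max(0, i - half)
        hi = min(len(rectified), i + half + 1)
        smoothed.append(sum(rectified[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical test signal: a tone that is loud for 40 samples, then quiet
tone = [math.sin(2 * math.pi * 0.25 * t) * (1.0 if t < 40 else 0.2) for t in range(80)]
env = envelope(tone, window=9)
```

The envelope follows the slow change in loudness while discarding the fast oscillation of the tone itself.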

    Aspects of voice irregularity measurement in connected speech

    Applications of connected speech material to the objective assessment of two primary physical aspects of voice quality are described and discussed. Simple auditory perceptual criteria are employed to guide the choice of analysis parameters for the physical correlate of pitch, and their utility is investigated by measuring the characteristics of particular examples of the normal-speaking voice. This approach is extended to the measurement of vocal fold contact phase control in connected speech, and both techniques are applied to pathological voice data.

    Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification

    Most speech signal analysis procedures for automatic detection of laryngeal pathologies rely mainly on parameters extracted from time-domain processing. Moreover, calculating these parameters often requires prior pitch period estimation; their validity therefore depends heavily on the robustness of pitch detection. This paper presents an alternative approach based on cepstral-domain processing, which has the advantage of not requiring pitch estimation and thus provides a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters already present in the literature, it has an easier physical interpretation while achieving similar performance standards.
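To show why cepstral processing needs no pitch estimate, here is a minimal sketch of the generic real cepstrum of a single frame: the inverse DFT of the log magnitude spectrum. It is the textbook definition, not the specific parameterisation proposed in the paper; the helper names and the `floor` guard against taking the log of zero are assumptions for this example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, adequate for short illustrative frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    n = len(frame)
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    # The log magnitude of a real frame is symmetric, so the result is real
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * q / n) for k in range(n)).real / n
            for q in range(n)]


# A scaled impulse has a flat spectrum, so only the zeroth coefficient is non-zero
coeffs = real_cepstrum([2.0, 0.0, 0.0, 0.0])
```

Every step operates on the whole frame at once, so no cycle boundaries have to be located beforehand.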

    Use of Mel Frequency Cepstral Coefficients for Automatic Pathology Detection on Sustained Vowel Phonations: Mathematical and Statistical Justification

    This paper presents a justification for the use of MFCC parameters in automatic pathology detection on speech. While this application has produced good results so far, only partial explanations of its good performance have been given. The explanation presented here consists of an interpretation of the mathematical transformations involved in MFCC calculation, and a statistical analysis that confirms the conclusions drawn from the theoretical reasoning.
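Two of the transformations usually involved in MFCC calculation can be sketched briefly: the warping of frequency onto the mel scale and the discrete cosine transform applied to log filterbank energies. The mel formula below is one common variant (others exist), and the function names and the example energies are assumptions for illustration rather than the paper's own formulation.

```python
import math


def hz_to_mel(f_hz):
    """One widely used mel-scale formula; other variants are also in use."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct_ii(values):
    """Unnormalised DCT-II, the transform typically applied to log mel energies."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n)]


# Hypothetical log energies from four mel-spaced filters (illustration only)
log_energies = [1.0, 1.0, 1.0, 1.0]
cepstral_coeffs = dct_ii(log_energies)
```

A flat set of log energies puts everything into the zeroth coefficient, which shows how the higher coefficients capture only the shape of the spectral envelope.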

    Getting better acquainted with Auditory Voice Hallucinations (AVHs): A need for clinical and social change

    The phenomenon of hearing voices (AVHs) is very much a subject of current scientific interest, both clinically and socially. For a long time, auditory hallucinations, understood as perceiving sounds without external stimuli (David, 2004), were considered an obvious sign of schizophrenic or psychotic psychopathology (Goodwin et al., 1971; Larøi et al., 2012), but these days such an association is no longer taken for granted. Various recent studies in psychology, psychiatry, and neuroscience have renewed interest in AVHs. First of all, the move beyond Kraepelinian logic (van Os, 2009; Fusar-Poli et al., 2014) has led us to see AVHs as a phenomenon in their own right, and not just a characteristic of schizophrenia (Fernyhough, 2004). Furthermore, a number of imaging studies have allowed us to study the phenomenon live, as it occurs, collecting various new data (Shergill et al., 2000). On the other hand, psychological studies that attempt to model the phenomenon have boosted the idea that AVHs are linked to the linguistic and verbal qualities of the subject, thus reducing the association between voice hallucinations and signs of pathology (Johns and van Os, 2001; Pearson et al., 2001; Stanghellini and Cutting, 2003). Other researchers have theorized that hearing voices is a different manifestation of self-awareness (Salvini and Bottini, 2011; Salvini and Quarato, 2011). Even DSM-5 has modified the importance it attaches to hallucinations: although the 4th edition diagnosed “schizophrenia” simply on the basis of the symptom “hallucinations,” in the new edition hallucinations on their own are not considered a sufficient symptom to diagnose the schizophrenia spectrum (American Psychiatric Association, 2013).
Many of those suffering from this condition are not under treatment and are not diagnosable in psychopathological terms, which raises ever more questions for health professionals (Iudici, 2015) and brings with it the risk that hearing voices may be considered pathological because of a lack of understanding of the problem. One direct implication of this risk concerns non-psychotic and non-schizophrenic hearers of voices who are afraid of being considered mad or disturbed, and who very often live in fear for years without talking about it with anyone, although they realize that hearing voices causes no general maladjustment in their lives (Andrew et al., 2008). In the long term this can lead to feelings of alarm in some of them, and when such situations result in a visit to a clinic or a psychiatrist, there are often “suffering and conflicted confessions” about such experiences, especially by people who have never had psychiatric experience (Iudici and Gagliardo Corsi, 2017). These people consequently do not have appropriate information to help them understand their experiences (Faccio et al., 2013). This fact raises further doubts about the direct juxtaposition of auditory hallucinations and diagnoses of mental disturbance, and consequently our interest is in sensitizing clinicians to a broader interpretation of the phenomenon than the traditional view, highlighting the importance of considering more perspectives.

    Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection

    Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.

Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.
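To give a feel for the fractal scaling measure, the following sketch implements textbook detrended fluctuation analysis: integrate the signal, remove a linear trend from each window, and read the scaling exponent from the slope of log fluctuation against log window size. It is a generic version under assumed window sizes, and the paper's own variant (including any normalisation of the exponent) may differ; all names and the synthetic signals are hypothetical.

```python
import math
import random


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_exponent(signal, scales):
    """Slope of log F(n) against log n, where F(n) is the RMS detrended fluctuation."""
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for s in signal:                      # integrate the mean-removed signal
        total += s - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        if n < 3 or n > len(profile):
            raise ValueError("scale must be between 3 and the signal length")
        t = list(range(n))
        sq_sum, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):   # non-overlapping windows
            window = profile[start:start + n]
            slope, intercept = linear_fit(t, window)
            sq_sum += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(t, window))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))
    return linear_fit(log_n, log_f)[0]


# Synthetic signals (illustration only): white noise and its running sum
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, acc = [], 0.0
for v in noise:
    acc += v
    walk.append(acc)

alpha_noise = dfa_exponent(noise, [16, 32, 64, 128])
alpha_walk = dfa_exponent(walk, [16, 32, 64, 128])
```

Uncorrelated noise gives an exponent near 0.5 and a random walk gives a clearly larger one, so the exponent summarises how the randomness in a signal scales across time.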

Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8% plus or minus 2.0%. The true positive classification performance is 95.4% plus or minus 3.2%, and the true negative performance is 91.5% plus or minus 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.
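The three figures reported here correspond to overall accuracy, true positive rate and true negative rate. The sketch below shows how such rates, and a percentile bootstrap interval around them, could be computed from a list of labels and predictions. It is an assumption-laden illustration: the labels are hypothetical, the function names are invented for this example, and the resampling shown is a plain percentile bootstrap over predictions, which may not match the paper's bootstrapped classifier.

```python
import random


def rates(labels, predictions):
    """labels/predictions use True for disordered and False for normal."""
    tp = sum(1 for l, p in zip(labels, predictions) if l and p)
    tn = sum(1 for l, p in zip(labels, predictions) if not l and not p)
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    return {"accuracy": (tp + tn) / len(labels), "tpr": tp / pos, "tnr": tn / neg}


def bootstrap_interval(labels, predictions, metric, n_resamples=1000, seed=0):
    """95% percentile interval for one metric, resampling cases with replacement."""
    if all(labels) or not any(labels):
        raise ValueError("need at least one case of each class")
    rng = random.Random(seed)
    pairs = list(zip(labels, predictions))
    stats = []
    while len(stats) < n_resamples:
        sample = [rng.choice(pairs) for _ in pairs]
        ls, ps = zip(*sample)
        if all(ls) or not any(ls):        # skip resamples missing a class
            continue
        stats.append(rates(ls, ps)[metric])
    stats.sort()
    return stats[int(0.025 * n_resamples)], stats[int(0.975 * n_resamples) - 1]


# Hypothetical outcomes for eight cases (illustration only)
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, False, False, False, True]
point = rates(labels, predictions)
low, high = bootstrap_interval(labels, predictions, "accuracy", n_resamples=200)
```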

Conclusions: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.