    Intra- and Inter-database Study for Arabic, English, and German Databases: Do Conventional Speech Features Detect Voice Pathology?

    A large population around the world has voice complications. Various approaches for subjective and objective evaluation have been suggested in the literature. The subjective approach depends strongly on the experience and area of expertise of the clinician, and human error cannot be neglected. The objective, or automatic, approach is noninvasive. Automatic systems can provide complementary information that may help a clinician in the early screening of a voice disorder. They can also be deployed in remote areas, where a general practitioner can use them and refer the patient to a specialist to avoid complications that may be life threatening. Many automatic systems for disorder detection have been developed by applying different types of conventional speech features, such as linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCCs). This study aims to ascertain whether conventional speech features detect voice pathology reliably, and whether they can be correlated with voice quality. To investigate this, an automatic detection system based on MFCCs was developed, and three different voice disorder databases were used. The experimental results suggest that the accuracy of the MFCC-based system varies from database to database. The detection rate ranges from 72% to 95% for the intra-database experiments and from 47% to 82% for the inter-database experiments. The results indicate that conventional speech features are not correlated with voice quality, and hence are not reliable for pathology detection.
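
As a rough illustration of the Mel-frequency warping that underlies MFCCs, the sketch below converts between Hz and Mel and spaces filter centers evenly on the Mel axis. It assumes the widely used 2595·log10(1 + f/700) formula; the abstract does not say which variant or filter count the study used, and the function names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to Mel (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, low_hz, high_hz):
    """Center frequencies (Hz) of n_filters spaced evenly on the Mel axis."""
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Illustrative settings only: 10 filters between 0 Hz and 8 kHz.
    for center in mel_center_frequencies(10, 0.0, 8000.0):
        print(round(center, 1))
```

Because the spacing is uniform in Mel rather than in Hz, the centers are dense at low frequencies and sparse at high ones. A full MFCC pipeline would add framing, windowing, a power spectrum, triangular filters, a logarithm, and a discrete cosine transform.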

    A zero-watermarking algorithm for privacy protection in biomedical signals

    Confidentiality of health information is indispensable for protecting the privacy of an individual. However, recent advances in electronic healthcare systems allow sensitive information to be transmitted over the Internet, which is prone to various vulnerabilities and attacks and may lead to unauthorized disclosure. Such situations may not only create adverse effects for individuals but may also cause severe consequences such as hefty regulatory fines, bad publicity, legal fees, and forensics. To avoid such predicaments, this study proposes a privacy-protected healthcare system that protects the identity of an individual and also detects vocal fold disorders. The privacy of the developed healthcare system is based on the proposed zero-watermarking algorithm, which embeds a watermark in a secret key instead of in the signal to avoid distorting the audio sample. The identity is protected by generating its secret shares through visual cryptography. The generated shares are embedded by finding patterns in the audio with a one-dimensional local binary pattern. The proposed zero-watermarking algorithm is evaluated using audio samples taken from the Massachusetts Eye and Ear Infirmary voice disorder database. Experimental results demonstrate that the proposed algorithm achieves imperceptibility and is reliable in its extraction of identity. In addition, the proposed algorithm does not affect the results of disorder detection, and it is robust against noise attacks at various signal-to-noise ratios.
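
To make the one-dimensional local binary pattern mentioned above concrete, here is a minimal sketch of one common formulation: each sample is compared with its neighbors on both sides, and the comparison bits form an integer code. The neighbor count, the `>=` threshold, and the function name are assumptions; this shows only the pattern step, not the watermarking scheme itself.

```python
def lbp_1d(signal, neighbors=8):
    """Return one LBP code per interior sample of a 1-D signal.

    For each sample, `neighbors // 2` samples on either side are compared
    with the center; a neighbor that is >= the center sets its bit to 1.
    """
    samples = list(signal)
    half = neighbors // 2
    codes = []
    for i in range(half, len(samples) - half):
        center = samples[i]
        # Left neighbors first, then right neighbors (center excluded).
        window = samples[i - half:i] + samples[i + 1:i + half + 1]
        code = 0
        for bit, value in enumerate(window):
            if value >= center:
                code |= 1 << bit
        codes.append(code)
    return codes


if __name__ == "__main__":
    print(lbp_1d([1, 2, 3, 2, 1], neighbors=2))
```

The codes depend only on the ordering of samples around each center, so patterns can be located without altering the audio.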

    An intelligent healthcare system for detection and classification to discriminate vocal fold disorders

    The growing population of senior citizens around the world will pose a major challenge in the future, as they will occupy a significant portion of healthcare facilities. Therefore, it is necessary to develop intelligent healthcare systems that can be deployed in smart homes and cities for remote diagnosis. To address this problem, an intelligent healthcare system is proposed in this study. The proposed system is based on the human auditory mechanism and is capable of detecting and classifying various types of vocal fold disorders. In the proposed system, the critical bandwidth phenomenon is implemented using bandpass filters spaced over the Bark scale to simulate the human auditory mechanism. The system therefore acts like an expert clinician who evaluates the voice of a patient by auditory perception. The experimental results show that the proposed system can detect pathology with an accuracy of 99.72%. Moreover, the classification accuracy for vocal fold polyp, keratosis, vocal fold paralysis, vocal fold nodules, and adductor spasmodic dysphonia is 97.54%, 99.08%, 96.75%, 98.65%, 95.83%, and 95.83%, respectively. In addition, an experiment for paralysis versus all other disorders was conducted, and an accuracy of 99.13% was achieved. The results show that the proposed system is accurate and reliable in vocal fold disorder assessment and can be deployed successfully for remote diagnosis. Moreover, the proposed system performs better than existing disorder assessment systems.
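
The Bark-scale spacing of the bandpass filters can be sketched as follows. This assumes the Zwicker and Terhardt approximation for the Hz-to-Bark mapping and places a band edge wherever the Bark value crosses an integer; the abstract does not give the formula or filter design actually used, so both are illustrative.

```python
import math


def hz_to_bark(freq_hz):
    """Hz to Bark, using the Zwicker-Terhardt approximation (assumed)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def bark_band_edges(max_hz, step_hz=10):
    """Frequencies (Hz) at which the Bark value first reaches each integer.

    The scan resolution is step_hz, so each edge is accurate to within
    one step. The first edge is always 0 Hz.
    """
    edges = [0]
    next_bark = 1
    for freq in range(step_hz, max_hz + 1, step_hz):
        if hz_to_bark(freq) >= next_bark:
            edges.append(freq)
            next_bark += 1
    return edges


if __name__ == "__main__":
    # Each consecutive pair of edges bounds one critical band.
    print(bark_band_edges(8000))
```

The bands are roughly 100 Hz wide at low frequencies and widen steadily above about 500 Hz, which mirrors the frequency resolution of human hearing.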

    Glottal flow characteristics in vowels produced by speakers with heart failure

    Heart failure (HF) is one of the most life-threatening diseases globally. HF is an under-diagnosed condition, and more screening tools are needed to detect it. A few recent studies have suggested that HF also affects the functioning of the speech production mechanism by causing generation of edema in the vocal folds and by impairing the lung function. It has not yet been studied whether these possible effects of HF on the speech production mechanism are large enough to cause acoustically measurable differences to distinguish speech produced in HF from that produced by healthy speakers. Therefore, the goal of the present study was to compare speech production between HF patients and healthy controls by focusing on the excitation signal generated at the level of the vocal folds, the glottal flow. The glottal flow was computed from speech using the quasi-closed phase glottal inverse filtering method and the estimated flow was parameterized with 12 glottal parameters. The sound pressure level (SPL) was measured from speech as an additional parameter. The statistical analyses conducted on the parameters indicated that most of the glottal parameters and SPL were significantly different between the HF patients and healthy controls. The results showed that the HF patients generally produced a more rounded glottal pulse and a lower SPL level compared to the healthy controls, indicating incomplete glottal closure and inappropriate leakage of air through the glottis. The results observed in this preliminary study indicate that glottal features are capable of distinguishing speakers with HF from healthy controls. Therefore, the study suggests that glottal features constitute a potential feature extraction approach which should be taken into account in future large-scale investigations in studying the automatic detection of HF from speech.

    Optimizing laryngeal pathology detection by using combined cepstral features

    Several diseases, organic or neurological, affect the quality of the human voice. Acoustic analysis of voice features can be used as a complementary and noninvasive tool for the diagnosis of laryngeal pathologies. The reliability and effectiveness of the discrimination process depend on appropriate acoustic feature extraction. This work presents a parametric method based on cepstral features to discriminate the pathological voices of speakers affected by vocal fold edema and paralysis from healthy voices. Cepstral, weighted cepstral, delta cepstral, and weighted delta cepstral coefficients are obtained from speech signals. In the classification process, vector quantization is carried out individually for each feature, associated with a distortion measurement. The goal is to evaluate the performance of a classifier based on the individual and combined cepstral features. The average, the product, and the weighted average are the combination strategies applied, yielding a multiple classifier that is more efficient than each individual technique. To assess the accuracy of the system, 153 speech files of the sustained vowel /ah/ (53 healthy, 44 vocal fold edema, and 56 paralysis) from the Disordered Voice Database of the Massachusetts Eye and Ear Infirmary (MEEI) are used. Results show that the employed parameters are complementary and can be used to detect vocal disorders caused by the presence of vocal fold pathologies.
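
The three combination strategies named above (average, product, weighted average) can be sketched as score fusion over per-feature distortion values, with the class of lowest fused distortion winning. The distortion numbers, weights, and function names below are invented for illustration and are not taken from the paper.

```python
import math


def combine_scores(distortions, rule="average", weights=None):
    """Fuse the per-feature distortions of one class into a single score."""
    if rule == "average":
        return sum(distortions) / len(distortions)
    if rule == "product":
        return math.prod(distortions)
    if rule == "weighted":
        if weights is None:
            raise ValueError("weights are required for the weighted rule")
        total = sum(w * d for w, d in zip(weights, distortions))
        return total / sum(weights)
    raise ValueError("unknown rule: " + rule)


def classify(per_class_distortions, rule="average", weights=None):
    """Return the class label with the smallest fused distortion."""
    fused = {
        label: combine_scores(values, rule, weights)
        for label, values in per_class_distortions.items()
    }
    return min(fused, key=fused.get)


if __name__ == "__main__":
    # Hypothetical distortions for four cepstral feature types per class.
    scores = {
        "healthy": [0.2, 0.4, 0.3, 0.5],
        "edema": [0.6, 0.1, 0.7, 0.4],
        "paralysis": [0.9, 0.8, 0.6, 0.7],
    }
    print(classify(scores, "average"))
    print(classify(scores, "product"))
    print(classify(scores, "weighted", weights=[1, 1, 2, 2]))
```

Fusion helps when the individual features make different errors, which is the sense in which the abstract calls the parameters complementary.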

    Wavelet description of the Glottal Gap

    The Glottal Source correlates reconstructed from the phonated parts of voice may render interesting information with applicability in different fields. One of them is the detection of defective closure (gap). The paper reviews the background needed to explain the physical foundations of the defective gap. A possible method to estimate the defective gap, based on a Wavelet Description of the Glottal Source, is also presented. The method is validated using results from the analysis of a gender-balanced speaker database. Normative values for the different estimated parameters are given. A set of study cases with deficient glottal closure is presented and discussed.

    A New PSO Classifier Based Method Applied to Detect Anomalies of the Larynx

    The quality of the human voice can be affected by anomalies of the larynx of physical, neuromuscular, or purely nervous origin. Video stroboscopy and vocal fold movement display systems are key tools often used to detect laryngeal anomalies. These methods are invasive, time consuming, and expensive, so researchers are trying to find non-invasive methods that reach a final answer faster and are more tolerable for patients. Much interest is directed at applying speech processing techniques to this problem. In these works, researchers have used different processing methods from medical engineering to detect anomalies. Recently, a variety of studies have been presented that detect anomalies from the audio signals of individuals based on features extracted from those signals. These methods have been used to separate patient audio from non-patient audio. However, they do not work properly when an anomaly must be identified among several anomalies, and they yield high error rates. In this paper, we propose a new method of automatic anomaly detection based on a new feature extraction mechanism and a PSO classifier. In the proposed work, feature extraction is done in three ways: the first uses MFCC features, the second uses jitter and shimmer features, and the third combines MFCC with jitter and shimmer. The extracted features are then used with the PSO algorithm to analyze and classify anomalies into several classes. We used four groups of anomalies and a class of normal voice as benchmark data sets, and we evaluated and compared the proposed method under the different feature extraction strategies. Our simulation results confirm the superior performance of the proposed method, especially when the features are extracted from the combination of MFCC, jitter, and shimmer. The result is 80% for the combination, 66% for MFCC alone, and 43% for jitter and shimmer alone.
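
For readers unfamiliar with jitter and shimmer, the sketch below computes their common "local" variants: the mean absolute difference between consecutive glottal periods (jitter) or peak amplitudes (shimmer), divided by the mean value. The abstract does not say which variants the paper used, and the input values are made up for illustration.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values over the mean value."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods):
    """Cycle-to-cycle variation of the glottal period (seconds)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude."""
    return relative_perturbation(amplitudes)


if __name__ == "__main__":
    # Hypothetical per-cycle measurements from a sustained vowel.
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    amplitudes = [0.80, 0.78, 0.82, 0.79]
    print(local_jitter(periods), local_shimmer(amplitudes))
```

Both measures are zero for a perfectly periodic signal and grow as cycle-to-cycle irregularity increases, which is why they are used as indicators of disordered phonation.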