3 research outputs found

    A Voice Disease Detection Method Based on MFCCs and Shallow CNN

    The incidence of voice diseases is increasing year by year. Software-based remote diagnosis is a growing technical trend with important practical value. Common voice diseases that cause hoarseness include spasmodic dysphonia, vocal cord paralysis, vocal nodules, and vocal cord polyps. This paper presents a voice disease detection method that can be applied in a wide range of clinical settings. We cooperated with Xiangya Hospital of Central South University to collect voice samples from sixty-one patients. Mel-frequency cepstral coefficients (MFCCs) are extracted as input features to describe the voice numerically. A model combining MFCC features with a CNN that has a single convolutional layer is proposed for fast computation and classification. The highest accuracy we achieved was 92%, which is well ahead of earlier research results and internationally advanced. We also used the Advanced Voice Function Assessment Databases (AVFAD) to evaluate the generalization ability of the proposed method, which achieved an accuracy of 98%. Experiments on clinical and standard datasets show that, for the pathological detection of voice diseases, our method greatly improves both accuracy and computational efficiency.
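
    As a rough illustration of the mel-frequency scale that MFCC features are built on, the sketch below converts between hertz and mel and spaces filter centre frequencies evenly in mel. The abstract does not say which mel formula, filter count, or frequency range the authors used; the common 2595 * log10(1 + f / 700) variant, the 26 filters, and the 0 to 8000 Hz range here are all assumptions chosen for illustration.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert hertz to mel (assumed variant: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list:
    """Centre frequencies in Hz of n_filters filters spaced evenly on the mel axis."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 26 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(26, 0.0, 8000.0)
```

    The centres are dense at low frequencies and sparse at high ones, which is the perceptual warping that the cepstral coefficients are then computed over.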

    A Pipeline to Evaluate the Effects of Noise on Machine Learning Detection of Laryngeal Cancer

    Cases of laryngeal cancer are rising, and diagnosis often involves invasive biopsy procedures. An alternative approach is to identify high-risk patients by analysing voice recordings, which can alert clinical teams to patients who need prioritisation. We propose a pipeline for evaluating speech classifier performance in the presence of noise, and we use it in experiments with several classifiers and denoising techniques. The random forest classifier performed best, with an accuracy of 81.2% on clean data that dropped to 63.8% when noise was added to the recordings. Added noise reduced the accuracy of all classifiers; signal denoising improved classifier accuracy but could not fully reverse the effects of noise. The effect of noise on classification is a complex issue that must be resolved before these detection systems can be implemented in clinical practice. We show that the proposed pipeline allows classifier performance to be evaluated in the presence of noise.
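
    One step such a pipeline needs is mixing noise into a clean recording at a controlled level. The sketch below is a minimal illustration, not the authors' method: the abstract does not state the noise type or levels, so white Gaussian noise, a 10 dB signal-to-noise ratio, and a synthetic 220 Hz tone standing in for a voice recording are all assumptions.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def add_white_noise(samples, snr_db, rng):
    """Return samples plus Gaussian noise scaled to the requested SNR in dB."""
    noise = [rng.gauss(0.0, 1.0) for _ in samples]
    target_noise_power = signal_power(samples) / (10.0 ** (snr_db / 10.0))
    # Rescale the drawn noise so its measured power matches the target exactly.
    scale = math.sqrt(target_noise_power / signal_power(noise))
    return [s + scale * n for s, n in zip(samples, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB, treating the difference between noisy and clean as noise."""
    residual = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(residual))


# Synthetic stand-in for a recording: one second of a 220 Hz tone at 16 kHz.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noisy = add_white_noise(clean, snr_db=10.0, rng=rng)
```

    A classifier can then be scored on both `clean` and `noisy` inputs, and again after a denoising step, to compare the three accuracies in the way the abstract describes.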

    ByoVoz Automatic Voice Condition Analysis System for the 2018 FEMH Challenge

    This paper presents the methods used and results obtained by the ByoVoz team in designing an automatic voice condition analysis system, which was submitted to the 2018 Far East Memorial Hospital voice data challenge. The proposed methodology is based on a cascading scheme that first discriminates between pathological and normophonic voices and then identifies the type of disorder. Using diverse feature selection techniques, a subset of complexity, spectral/cepstral, and perturbation characteristics was identified for the proposed tasks. Several classification methodologies based on Gaussian Mixture Models and Gradient Boosting were then employed to make decisions about the input voices in the binary classification, and one-vs-one classification systems based on Random Forests were used to categorize voices by type of disorder. With a 4-fold cross-validation approach on the training partition, a sensitivity of 0.93 and a specificity of 0.74 were obtained. Similarly, an unweighted average recall of 0.63 and an accuracy of 66% were obtained for the identification task. Using the scoring metric proposed in the challenge, the final score considering both detection and identification is 0.77.
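
    The metrics reported above, and the two-stage cascade, can be sketched as follows. This is a minimal illustration using the standard definitions of sensitivity, specificity, and unweighted average recall; the function names, the toy labels, and the two callables passed to `cascade_predict` are hypothetical and are not taken from the ByoVoz system.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary detection task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def cascade_predict(is_pathological, identify_disorder, features):
    """Stage 1 detects pathology; stage 2 runs only on voices flagged as pathological."""
    if not is_pathological(features):
        return "normophonic"
    return identify_disorder(features)


# Toy labels: 1 = pathological, 0 = normophonic.
sens, spec = sensitivity_specificity([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])

# Toy three-class identification task with an imbalanced class "a".
uar = unweighted_average_recall(list("aaaabc"), list("aaaacc"))
```

    In the toy identification example, five of six predictions are correct, but the unweighted average recall is only two thirds because class "b" is never recovered. The same effect explains how an accuracy of 66% and an unweighted average recall of 0.63 can differ on imbalanced data.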