3 research outputs found

    Dysphonia Detection Index (DDI): A New Multi-Parametric Marker to Evaluate Voice Quality

    The rapid spread of voice disorders and limited awareness of the problem have prompted the search for novel, reliable approaches to detecting dysphonia with easy, accessible instruments such as mobile devices. Such systems are valid instruments for improving patient care, both by facilitating the monitoring of symptoms and by supporting the correct diagnosis of a pathology such as dysphonia. In this paper, we propose a new marker, the Dysphonia Detection Index, that supports the evaluation of voice disorders and can be embedded in a mobile health solution. Four acoustic parameters are combined into a single marker to evaluate the overall health of the voice and to assess whether a voice disorder is present. A model tree regression algorithm was applied to define the relationship between these parameters, and Youden analysis was used to define the threshold value that distinguishes a pathological voice from a healthy one. The reliability of the proposed index was tested in terms of classification accuracy, sensitivity, and specificity. Its performance was evaluated on a dataset of 2003 voices selected from three databases: the Massachusetts Eye and Ear Infirmary, the Saarbruecken Voice, and the VOice ICar fEDerico II databases. Our approach achieved the best performance in comparison with other algorithms, with an accuracy of 82.2%, a sensitivity of 82%, and a specificity of 82.6%.
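
As a rough illustration of the Youden analysis mentioned in the abstract above, the following minimal Python sketch picks the cut-off on a scalar index that maximizes J = sensitivity + specificity - 1, then reports accuracy, sensitivity, and specificity at that cut-off. The function names, the convention that a higher score indicates pathology, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
def classification_metrics(scores, labels, threshold):
    """Accuracy, sensitivity and specificity when score >= threshold means 'pathological'.

    labels: 1 = pathological, 0 = healthy (an assumed encoding).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one sample of each class")
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
    }


def youden_threshold(scores, labels):
    """Return (threshold, J) for the candidate cut-off with the largest Youden J.

    Every distinct score is tried as a candidate; ties keep the lowest cut-off.
    """
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        m = classification_metrics(scores, labels, t)
        j = m["sensitivity"] + m["specificity"] - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


if __name__ == "__main__":
    # Toy index values, invented purely for demonstration.
    toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
    cutoff, j = youden_threshold(toy_scores, toy_labels)
    print(cutoff, j, classification_metrics(toy_scores, toy_labels, cutoff))
```

Trying every distinct score as a cut-off is quadratic in the number of samples, which is fine for a sketch; a sorted single pass or a ROC routine from a statistics library would be the usual choice on a dataset of a few thousand voices.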

    A Review: Voice Pathology Classification Using Machine Learning

    Voice pathology detection requires a specialist doctor and time to treat each patient, but a doctor cannot always be available to treat all patients at once and at a precise time. For residents of remote areas, expensive equipment must also be provided, and some people may not be aware that they have a voice pathology at all. Our goal is to design a diagnostic aid system that detects whether a voice is pathological or healthy, so that the patient can be referred to a doctor, or not, without having to travel first. Our system is based on classification by a Support Vector Machine (SVM) using Mel Frequency Cepstral Coefficients (MFCCs) extracted from the patient's voice. The system is trained and tested using the Saarbruecken Voice Database (SVD).

    Evaluation of the ability of statistics of cepstral peak prominence distributions of sustained vowels to distinguish hyperkinetic dysphonia, hypokinetic dysphonia, and laryngitis caused by reflux

    TCC (undergraduate thesis) - Universidade Federal de Santa Catarina. Centro Tecnológico. Engenharia Eletrônica. This study aims to evaluate the distributions of the cepstral peak prominence (CPP) and the smoothed cepstral peak prominence (CPPS) for the sustained vowel /a/, and their descriptive statistics, as discriminators between healthy voices and the voices of patients diagnosed with hyperkinetic dysphonia, hypokinetic dysphonia, or laryngitis caused by reflux. Measurements were calculated in decibels for 184 voices, 51 healthy and 133 with pathologies. The pathologies were divided into 3 categories, hyperkinetic dysphonia, hypokinetic dysphonia, and laryngitis caused by reflux, containing 67, 34, and 32 voices respectively. Outliers were removed from each of the 3 categories and from the pooled pathological and healthy voices. The CPP and CPPS measurements were evaluated with 10 distribution statistics (mean, median, fifth percentile, 95th percentile, standard deviation, kurtosis, skewness, range, mode, and variance) to find the highest discriminant capacity. These 10 statistics were evaluated for 2 CPP measurements, calculated every 2 ms and every 10 ms, and 2 CPPS measurements, calculated every 2 ms with smoothing over 7 and 11 samples, in 4 cases: healthy versus pathological, healthy versus hyperkinetic dysphonia, healthy versus hypokinetic dysphonia, and healthy versus reflux laryngitis. The fifth percentile of the CPP calculated every 2 ms, for the case of healthy voices versus hypokinetic dysphonia, had the highest discriminant capacity, reaching a p-value of 3.63E-08, an accuracy of 81.18%, and a ROC area of 0.8326. Using an SVM with the mean, skewness, and fifth percentile statistics, it was possible to improve this result and obtain a classification accuracy of 88.09%.
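
To make the descriptive statistics named in the abstract above concrete, here is a minimal Python sketch that summarizes a list of per-frame CPP values (in dB) using only the standard library. The function names, the linear-interpolation percentile rule, and the use of population (not sample) moments are assumptions for illustration; the thesis may define these quantities differently. The mode is left out because, for continuous values, it depends on a binning choice the abstract does not specify.

```python
import statistics


def percentile(values, q):
    """q-th percentile (0 <= q <= 100) by linear interpolation between sorted ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def cpp_summary(cpp_db):
    """Descriptive statistics of one recording's CPP distribution (values in dB)."""
    n = len(cpp_db)
    mean = statistics.fmean(cpp_db)
    var = statistics.pvariance(cpp_db, mu=mean)  # population variance
    # Third and fourth central moments for skewness and kurtosis.
    m3 = sum((x - mean) ** 3 for x in cpp_db) / n
    m4 = sum((x - mean) ** 4 for x in cpp_db) / n
    return {
        "mean": mean,
        "median": statistics.median(cpp_db),
        "p5": percentile(cpp_db, 5),
        "p95": percentile(cpp_db, 95),
        "std": var ** 0.5,
        "variance": var,
        "range": max(cpp_db) - min(cpp_db),
        # Both are reported as 0.0 for a constant signal to avoid dividing by zero.
        "skewness": m3 / var ** 1.5 if var > 0 else 0.0,
        "kurtosis": m4 / var ** 2 if var > 0 else 0.0,  # non-excess kurtosis
    }


if __name__ == "__main__":
    # Invented CPP values, not data from the study.
    print(cpp_summary([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Each recording then contributes one feature vector, for example its mean, skewness, and fifth percentile, which is the kind of input a classifier such as an SVM would consume.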