16 research outputs found

    Pathological Speech Classification Using a Convolutional Neural Network

    Convolutional Neural Networks (CNNs) have enabled significant improvements across a number of applications in computer vision, such as object detection, face recognition and image classification. An audio signal can be visually represented as a spectrogram that captures the time-varying frequency content of the signal. This paper describes how a CNN can be applied to the spectrogram of an audio signal to distinguish pathological from healthy speech. We propose a CNN structure and implement it using Keras to test the approach. A classification accuracy of over 95% is obtained in experiments on two public pathological speech datasets.
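
To make the spectrogram representation mentioned in this abstract concrete, here is a minimal sketch of a magnitude spectrogram computed with a short-time Fourier transform. It is not the paper's pipeline: the frame length, hop size, Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for illustration, and the naive DFT is written out only to keep the example dependency-free.

```python
import cmath
import math


def hann_window(n):
    """Return an n-point symmetric Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time Fourier transform magnitudes.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes
    (bins from 0 Hz up to the Nyquist frequency).
    """
    window = hann_window(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT: O(N^2) per frame, fine for a demo but slow for real audio.
        spectrum = [
            abs(sum(chunk[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                    for t in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames


# Synthetic test tone (assumed values): 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]

spec = magnitude_spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

Each row of `spec` is one time frame and each column one frequency bin, so the result can be treated as a single-channel image. That is the kind of 2-D input a CNN classifier would consume, typically after log scaling.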

    Transfer learning with audioSet to voice pathologies identification in continuous speech

    The use of Deep Learning to classify voice pathologies has increased considerably in recent times. Existing work reports good results for classification of sustained speech with vowels, but few studies address classification of continuous speech. This work uses the German Saarbrücken Voice Database with the phrase “Guten Morgen, wie geht es Ihnen?” to classify four classes: dysphonia, laryngitis, paralysis of vocal cords and healthy voices. Transfer learning was applied using the AudioSet database. Two models, one based on Long Short-Term Memory and one on a Convolutional Network, were developed to classify the extracted embeddings, and their best results were compared using cross-validation. The final results were an F1-score of 40% for the four classes, 66% for dysphonia vs. healthy, 67% for laryngitis vs. healthy and 80% for paralysis vs. healthy.

    A Review: Voice Pathology Classification Using Machine Learning

    Voice pathology detection requires the presence of a specialist doctor and time to treat each patient, but it is not always possible to have a doctor who can see all patients at a given time. Residents of remote areas face the added cost of the equipment that must be provided, and some people may not even be aware that they have a voice pathology. Our goal is to design a diagnostic aid system that detects whether a voice is pathological or healthy, so that the patient can be referred to a doctor, or not, without having to travel in the first place. Our system is based on classification by a Support Vector Machine (SVM) using Mel Frequency Cepstral Coefficients (MFCCs) extracted from the patient's voice. The system is trained and tested using the Saarbruecken Voice Database (SVD).

    Voice pathologies : the most comum features and classification tools

    Speech pathologies are quite common in society. However, the existing exams are invasive, which makes them uncomfortable for patients, and their outcome depends on the experience of the clinician who performs the assessment. Hence the need to develop non-invasive methods that allow objective and efficient analysis. With this need in mind, this work identifies the most promising features and classifiers. The features identified were jitter, shimmer, HNR, LPC, PLP and MFCC, and the classifiers were CNN, RNN and LSTM. The study aims to develop a device to support medical decisions; this article already presents the system interface.
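
Of the features named in this abstract, jitter and shimmer are the simplest to write down: they measure cycle-to-cycle variation in pitch period and in peak amplitude, respectively. The sketch below uses the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean value). That choice is an assumption, since the abstract does not say which variant is meant, and the period and amplitude values are made up for illustration.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to pitch periods this gives local jitter; applied to per-cycle
    peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = sum(
        abs(a - b) for a, b in zip(values, values[1:])
    ) / (len(values) - 1)
    mean_value = sum(values) / len(values)
    return mean_abs_diff / mean_value


# Hypothetical measurements for four consecutive glottal cycles.
periods_ms = [10.0, 10.2, 9.8, 10.0]   # cycle durations in milliseconds
amplitudes = [1.0, 0.9, 1.1, 1.0]      # peak amplitude of each cycle

jitter = relative_perturbation(periods_ms)
shimmer = relative_perturbation(amplitudes)
```

Both values are dimensionless ratios and are often reported as percentages. In practice the periods and amplitudes would first have to be extracted from the recording by a pitch-tracking step, which is not shown here.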

    Prediction of Parkinson's disease using convolutional neural networks (Predicción de la enfermedad de Parkinson utilizando redes neuronales convolucionales)

    Parkinson's disease (PD) is a neurodegenerative disorder of the nervous system, of unknown cause and with a chronic, progressive and irreversible course. It is currently assumed that the pathophysiological changes that make the symptoms of the disease apparent are not visible until at least four years after its onset. For this reason, alternative methods are sought that allow the disease to be detected early. Since speech impairment is one of the symptoms of the disease, it may provide a biomarker for early diagnosis and monitoring. This work proposes a study based on deep learning of spectrograms obtained from voice signals recorded with mobile phones. The aim is to contribute to the diagnosis of PD and, at the same time, to knowledge of the voice characteristics affected by the disease. To this end, a database will be created of spectrograms of the audio segments that best characterize the voice of PD patients. Convolutional neural network models with different architectures will be developed to distinguish PD patients from healthy subjects, using validation suited to the characteristics of the data. Databases and Data Mining. Red de Universidades con Carreras en Informática.