Search CORE

176 research outputs found

Use of Mel Frequency Cepstral Coefficients for Automatic Pathology Detection on Sustained Vowel Phonations: Mathematical and Statistical Justification

Author: Fraile Muñoz Rubén
Godino Llorente Juan Ignacio
Gómez Vilda Pedro
Osma Ruiz Víctor
Sáenz Lechón Nicolas
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2008
Field of study

This paper presents a justification for the use of MFCC parameters in automatic pathology detection on speech. While such an application has produced good results up to now, only partial explanations to this good performance had been given before. The herein exposed explanation consists of an interpretation of the mathematical transformations involved in MFCC calculation and a statistical analysis that confirms the conclusions drawn from the theoretical reasoning

Archivo Digital UPM

Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification

Author: B. Boyanov
B. Boyanov
J.G. Proakis
J.I. Godino-Llorente
J.I. Godino-Llorente
J.I. Godino-Llorente
J.R. Deller
L. Rabiner
P.J. Murphy
R.O. Duda
S. Haykin
S.E. Bou-Ghazale
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

The majority of speech signal analysis procedures for automatic detection of laryngeal pathologies mainly rely on parameters extracted from time domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity heavily depends on the robustness of pitch detection. Within this paper, an alternative approach based on cepstral- domain processing is presented which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters, already present in literature, it has an easier physical interpretation while achieving similar performance standards

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

A two-stage approach using Gaussian mixture models and higher-order statistics for a classification of normal and pathological voices

Author
Publication venue: Springer
Publication date: 30/11/2012
Field of study

Springer - Publisher Connector

Introducing non-linear analysis into sustained speech characterization to improve sleep apnea detection

Author: Blanco Murillo José Luis
Hernández Gómez Luis Alfonso
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2011
Field of study

We present a novel approach for detecting severe obstructive sleep apnea (OSA) cases by introducing non-linear analysis into sustained speech characterization. The proposed scheme was designed for providing additional information into our baseline system, built on top of state-of-the-art cepstral domain modeling techniques, aiming to improve accuracy rates. This new information is lightly correlated with our previous MFCC modeling of sustained speech and uncorrelated with the information in our continuous speech modeling scheme. Tests have been performed to evaluate the improvement for our detection task, based on sustained speech as well as combined with a continuous speech classifier, resulting in a 10% relative reduction in classification for the first and a 33% relative reduction for the fused scheme. Results encourage us to consider the existence of non-linear effects on OSA patients' voices, and to think about tools which could be used to improve short-time analysis

Archivo Digital UPM

Introducing non-linear analysis into sustained speech characterization to improve sleep apnea detection

Author: A. Tsanas
A.W. Fox
G. Coccagna
J.A. Fiz
J.I. Godino-Llorente
M.P. Robb
P. Lloberes
R. Fernandez
T. Penzel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-25020-0_28Proceedings of 5th International Conference on Nonlinear Speech Processing, NOLISP 2011, Las Palmas de Gran Canaria (Spain)We present a novel approach for detecting severe obstructive sleep apnea (OSA) cases by introducing non-linear analysis into sustained speech characterization. The proposed scheme was designed for providing additional information into our baseline system, built on top of state-of-the-art cepstral domain modeling techniques, aiming to improve accuracy rates. This new information is lightly correlated with our previous MFCC modeling of sustained speech and uncorrelated with the information in our continuous speech modeling scheme. Tests have been performed to evaluate the improvement for our detection task, based on sustained speech as well as combined with a continuous speech classifier, resulting in a 10% relative reduction in classification for the first and a 33% relative reduction for the fused scheme. Results encourage us to consider the existence of non-linear effects on OSA patients’ voices, and to think about tools which could be used to improve short-time analysis.The activities described in this paper were funded by the Spanish Ministry of Science and Innovation as part of the TEC2009-14719-C02-02 (PriorSpeech) project

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

An intelligent healthcare system for detection and classification to discriminate vocal fold disorders

Author: Aguilera
Al-nasheri
Ali
Ali
Ali
Arias-Londoño
Arjmandi
Arun Kumar Sangaiah
Bhattacharyya
Bishop
Brinca
Cannito
Cohen
Cordeiro
Dhingra
Falk
Fontes
Ghulam Muhammad
Harris
Hossain
Hossain
Hossain
Hossain
Hu
Jiang
Karnell
Leonard
M. Shamim Hossain
Malki
Malki
Markaki
Martins
Massachusette Eye & Ear Infirmry Voice & Speech LAB
Mau
Mehmood
Muhammad
Muhammad
Muhammad
Muhammad
Muhammad
Muhammad
Muhammad
Parsa
Redner
Rosenthal
Roy
Selim
Wang
Werth
Yang
Zhang
Zulfiqar Ali
Zwicker
Publication venue: 'Elsevier BV'
Publication date: 01/08/2018
Field of study

The growing population of senior citizens around the world will appear as a big challenge in the future and they will engage a significant portion of the healthcare facilities. Therefore, it is necessary to develop intelligent healthcare systems so that they can be deployed in smart homes and cities for remote diagnosis. To overcome the problem, an intelligent healthcare system is proposed in this study. The proposed intelligent system is based on the human auditory mechanism and capable of detection and classification of various types of the vocal fold disorders. In the proposed system, critical bandwidth phenomena by using the bandpass filters spaced over Bark scale is implemented to simulate the human auditory mechanism. Therefore, the system acts like an expert clinician who can evaluate the voice of a patient by auditory perception. The experimental results show that the proposed system can detect the pathology with an accuracy of 99.72%. Moreover, the classification accuracy for vocal fold polyp, keratosis, vocal fold paralysis, vocal fold nodules, and adductor spasmodic dysphonia is 97.54%, 99.08%, 96.75%, 98.65%, 95.83%, and 95.83%, respectively. In addition, an experiment for paralysis versus all other disorders is also conducted, and an accuracy of 99.13% is achieved. The results show that the proposed system is accurate and reliable in vocal fold disorder assessment and can be deployed successfully for remote diagnosis. Moreover, the performance of the proposed system is better as compared to existing disorder assessment systems

University of Essex Research Repository

Crossref

Ulster University's Research Portal

Detección automática de voz hipernasal de niños con labio y paladar hendido a partir de vocales y palabras del español usando medidas clásicas y análisis no lineal

Author: Abarbanel
Golding
Hurst
Jolliffe
Kantz
Kummer
Maier
Michaelis
Orozco
Oseledec
Scholköpf
Publication venue: 'Universidad de Antioquia'
Publication date: 01/01/2016
Field of study

RESUMEN: Este artículo presenta un sistema para la detección automática de señales de voz hipernasales basado en la combinación de dos diferentes esquemas de caracterización aplicados en las cinco vocales del español y dos palabras seleccionadas. El primer esquema está basado en características clásicas como perturbaciones del periodo fundamental, medidas de ruido y coeficientes cepstrales en la frecuencia de Mel. El segundo enfoque está basado en medidas de dinámica no lineal. Las características más relevantes son seleccionadas usando dos técnicas: análisis de componentes principales y selección flotante hacia adelante secuencial. La decisión acerca de si un registro de voz es hipernasal o sano es tomada usando una máquina de soporte vectorial de margen suave. Los experimentos consideran grabaciones de las cinco vocales del idioma español y las palabras y se consideran, asimismo, tres conjuntos de características: (1) el enfoque clásico, (2) el análisis de dinámica no lineal y (3) la combinación de ambos esquemas. En general, los aciertos son mayores y más estables cuando las características clásicas y no lineales son combinadas, indicando que el análisis de dinámica no lineal se complementa con el esquema clásico.ABSTRACT: This paper presents a system for the automatic detection of hypernasal speech signals based on the combination of two different characterization approaches applied to the five spanish vowels and two selected words. The first approach is based on classical features such as pitch period perturbations, noise measures, and Mel-Frequency Cepstral Coefficients (MFCC). The second approach is based on the Non-Linear Dynamics (NLD) analysis. The most relevant features are selected and sorted using two techniques: Principal Components Analysis (PCA) and Sequential Forward Floating Selection (SFFS). The decision about whether a voice record is hypernasal or healthy is taken using a Soft Margin - Support Vector Machine (SM-SVM). Experiments upon recordings of the five Spanish vowels and the words are performed considering three different set of features: (1) the classical approach, (2) the NLD analysis, and (3) the combination of the classical and NLD measures. In general, the accuracies are higher and more stable when the classical and NLD features are combined, indicating that the NLD analysis is complementary to the classical approach

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Directory of Open Access Journals

Biblioteca Digital del Sistema de Bibliotecas de la Universidad de Antioquia

Assessment of severe apnoea through voice analysis, automatic speech, and speaker recognition techniques

Author: Alcázar Ramírez José
Blanco José Luis
Fernández Pozo Rubén
Hernández Gómez Luis
López Gonzalo Eduardo
Toledano Doroteo T.
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2009
Field of study

The electronic version of this article is the complete one and can be found online at: http://asp.eurasipjournals.com/content/2009/1/982531This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.The activities described in this paper were funded by the Spanish Ministry of Science and Technology as part of the TEC2006-13170-C02-02 Project

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Fondo Bibliográfico Digital Institucional

Biblos-e Archivo