Search CORE

123 research outputs found

Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification

Author: B. Boyanov
B. Boyanov
J.G. Proakis
J.I. Godino-Llorente
J.I. Godino-Llorente
J.I. Godino-Llorente
J.R. Deller
L. Rabiner
P.J. Murphy
R.O. Duda
S. Haykin
S.E. Bou-Ghazale
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

The majority of speech signal analysis procedures for automatic detection of laryngeal pathologies mainly rely on parameters extracted from time domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity heavily depends on the robustness of pitch detection. Within this paper, an alternative approach based on cepstral- domain processing is presented which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters, already present in literature, it has an easier physical interpretation while achieving similar performance standards

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Use of Mel Frequency Cepstral Coefficients for Automatic Pathology Detection on Sustained Vowel Phonations: Mathematical and Statistical Justification

Author: Fraile Muñoz Rubén
Godino Llorente Juan Ignacio
Gómez Vilda Pedro
Osma Ruiz Víctor
Sáenz Lechón Nicolas
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2008
Field of study

This paper presents a justification for the use of MFCC parameters in automatic pathology detection on speech. While such an application has produced good results up to now, only partial explanations to this good performance had been given before. The herein exposed explanation consists of an interpretation of the mathematical transformations involved in MFCC calculation and a statistical analysis that confirms the conclusions drawn from the theoretical reasoning

Archivo Digital UPM

Use of Cepstrum-based parameters for automatic pathology detection on speech. Analysis of performance and theoretical justification

Author: Fraile Muñoz Rubén
Godino Llorente Juan Ignacio
Osma Ruiz Víctor
Sáenz Lechón Nicolas
Publication venue: E.U.I.T. Telecomunicación (UPM)
Publication date: 01/01/2008
Field of study

The majority of speech signal analysis procedures for automatic pathology detection mostly rely on parameters extracted from time-domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity heavily depends on the robustness of pitch detection. Within this paper, an alternative approach based on cepstral-domain processing is presented which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters, already present in literature, it has an easier physical interpretation while achieving similar performance standards

Archivo Digital UPM

A Hybrid Parameterization Technique for Speaker Identification

Author: Fernández-Baillo Gallego de la Sacristana Roberto
Gómez Vilda Pedro
Martínez Olalla Rafael
Mazaira Fernández Luis Miguel
Muñoz Cristina
Nieto Lluis Victor
Rodellar Biarge M. Victoria
Álvarez Marquina Agustin
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2008
Field of study

Classical parameterization techniques for Speaker Identification use the codification of the power spectral density of raw speech, not discriminating between articulatory features produced by vocal tract dynamics (acoustic-phonetics) from glottal source biometry. Through the present paper a study is conducted to separate voicing fragments of speech into vocal and glottal components, dominated respectively by the vocal tract transfer function estimated adaptively to track the acoustic-phonetic sequence of the message, and by the glottal characteristics of the speaker and the phonation gesture. The separation methodology is based in Joint Process Estimation under the un-correlation hypothesis between vocal and glottal spectral distributions. Its application on voiced speech is presented in the time and frequency domains. The parameterization methodology is also described. Speaker Identification experiments conducted on 245 speakers are shown comparing different parameterization strategies. The results confirm the better performance of decoupled parameterization compared against approaches based on plain speech parameterization

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivo Digital UPM

A two-stage approach using Gaussian mixture models and higher-order statistics for a classification of normal and pathological voices

Author
Publication venue: Springer
Publication date: 30/11/2012
Field of study

Springer - Publisher Connector

Review of Research on Speech Technology: Main Contributions From Spanish Research Groups

Author: Martínez Hinarejos Carlos D.
Ortega Alfonso
San Segundo Hernández Rubén
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2011
Field of study

In the last two decades, there has been an important increase in research on speech technology in Spain, mainly due to a higher level of funding from European, Spanish and local institutions and also due to a growing interest in these technologies for developing new services and applications. This paper provides a review of the main areas of speech technology addressed by research groups in Spain, their main contributions in the recent years and the main focus of interest these days. This description is classified in five main areas: audio processing including speech, speaker characterization, speech and language processing, text to speech conversion and spoken language applications. This paper also introduces the Spanish Network of Speech Technologies (RTTH. Red Temática en Tecnologías del Habla) as the research network that includes almost all the researchers working in this area, presenting some figures, its objectives and its main activities developed in the last years

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RiuNet

Archivo Digital UPM

Analysis and Detection of Pathological Voice using Glottal Source Features

Author: Alku Paavo
Kadiri Sudarsana Reddy
Publication venue
Publication date: 25/09/2023
Field of study

Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximate glottal source signals computed with the zero frequency filtering (ZFF) method, and using acoustic voice signals directly. In addition, we propose to derive mel-frequency cepstral coefficients (MFCCs) from the glottal source waveforms computed by QCP and ZFF to effectively capture the variations in glottal source spectra of pathological voice. Experiments were carried out using two databases, the Hospital Universitario Principe de Asturias (HUPA) database and the Saarbrucken Voice Disorders (SVD) database. Analysis of features revealed that the glottal source contains information that discriminates normal and pathological voice. Pathology detection experiments were carried out using support vector machine (SVM). From the detection experiments it was observed that the performance achieved with the studied glottal source features is comparable or better than that of conventional MFCCs and perceptual linear prediction (PLP) features. The best detection performance was achieved when the glottal source features were combined with the conventional MFCCs and PLP features, which indicates the complementary nature of the features

arXiv.org e-Print Archive

A Review of the Assessment Methods of Voice Disorders in the Context of Parkinson's Disease

Author: Abdelilah A.
Benba A.
Hammouch A.
Publication venue: Journal of Telecommunication, Electronic and Computer Engineering (JTEC)
Publication date: 01/12/2016
Field of study

In recent years, a significant progress in the field of research dedicated to the treatment of disabilities has been witnessed. This is particularly true for neurological diseases, which generally influence the system that controls the execution of learned motor patterns. In addition to its importance for communication with the outside world and interaction with others, the voice is a reflection of our personality, moods and emotions. It is a way to provide information on health status, shape, intentions, age and even the social environment. It is also a working tool for many, but an important element of life for all. Patients with Parkinson’s disease (PD) are numerous and they suffer from hypokinetic dysarthria, which is manifested in all aspects of speech production: respiration, phonation, articulation, nasalization and prosody. This paper provides a review of the methods of the assessment of speech disorders in the context of PD and also discusses the limitations

Universiti Teknikal Malaysia Melaka: UTeM Open Journal System