4 research outputs found
Recommended from our members
Modelling and extraction of fundamental frequency in speech signals
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The audio sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition, the fundamental frequency and intonation carry information about speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for the functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis contributes to accurate pitch estimation in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of higher order moments and (3) an investigation of an analysis-synthesis method for selecting the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criterion and the order of the statistical moment, a window length of 37 to 80 ms gives the least error. To avoid the excessive delay that a longer window would cause, a method is proposed where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction. Second order and higher order moments, and the magnitude difference function, were explored and compared as similarity criteria. A novel method of calculating moments is introduced in which the signal is split, i.e. rectified, into positive and negative valued samples. The moments for the positive and negative parts of the signal are computed separately and then combined. The new method of calculating moments from positive and negative parts and the higher order criteria provide competitive results. A challenging issue in pitch estimation is the determination of the best candidate from N extrema of the similarity criterion. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with increasing error in the estimate of the fundamental frequency. To this end, a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides a consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in pitch accuracy and outperform the benchmark YIN method.
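As an illustration of a magnitude difference function used as a similarity criterion, the following is a minimal sketch and not the thesis's method. It implements a generic average magnitude difference function (AMDF) over candidate lags. The function name `amdf_pitch`, the 60 to 400 Hz search range and the tie-breaking tolerance are assumptions introduced here for illustration only.

```python
import math


def amdf_pitch(signal, sample_rate, f_min=60.0, f_max=400.0, tol=1e-9):
    """Estimate the fundamental frequency (Hz) of one window with the AMDF.

    Assumed, illustrative parameters: f_min/f_max bound the lag search,
    and tol keeps the smallest lag when multiples of the period tie.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = None, float("inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(signal) - lag
        if n <= 0:
            break  # window too short for this lag
        # Mean absolute difference between the window and its shifted copy.
        score = sum(abs(signal[i] - signal[i + lag]) for i in range(n)) / n
        # Require a clear improvement so that exact multiples of the
        # period do not win on rounding noise alone.
        if score < best_score - tol:
            best_lag, best_score = lag, score
    if best_lag is None:
        raise ValueError("window is too short for the requested range")
    return sample_rate / best_lag


# A 60 ms window of a synthetic 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
window = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(480)]
estimate = amdf_pitch(window, sample_rate)
```

The window must span at least a couple of periods of the lowest candidate frequency, which is one reason the window length matters for the error.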
Diagnosis of the sleep apnea-hypopnea syndrome : a comprehensive approach through an intelligent system to support medical decision
[Abstract] This doctoral thesis develops an intelligent system to support medical decision-making in the diagnosis of the Sleep Apnea-Hypopnea Syndrome (SAHS). SAHS is the most common of the disorders that affect sleep. Estimates of its prevalence range from 3% to 7% of the population, with severe consequences for health. Diagnosis of SAHS requires a polysomnographic test (PSG) carried out in the Sleep Unit of a medical center. Manual scoring of the resulting recording demands considerable time and effort from medical specialists and therefore carries a high economic cost. The developed system performs an automatic analysis of the PSG from a comprehensive perspective. First, the neurophysiological signals related to sleep are analysed in order to obtain the hypnogram. Then the respiratory signals are analysed and interpreted in the context of the remaining signals included in the PSG. To carry out this task, the system relies on several artificial intelligence techniques, with a particular focus on reasoning mechanisms capable of handling imprecise data. The main aim of the proposed system is to improve the diagnostic procedure and help physicians in the diagnosis of SAHS.
A Pitch Detector Based on a Generalized Correlation Function
Abstract: This paper proposes a novel pitch determination algorithm (PDA) based on the newly introduced concept of a generalized correlation function called correntropy. Correntropy is a positive definite kernel function which implicitly transforms the original signal into a high-dimensional reproducing kernel Hilbert space (RKHS) in a nonlinear way, and efficiently calculates the generalized correlation in that RKHS. By incorporating the kernel function, correntropy is able to utilize higher order statistics to enhance the resolution of pitch estimation. The proposed PDA computes the summary of correntropy functions from the outputs of an equivalent rectangular bandwidth (ERB) filter bank. We present simulations on pitch determination for a single vowel, double vowels, and a benchmark database test. Simulations show that correntropy exhibits much better resolution than conventional autocorrelation in pitch determination and outperforms other PDAs in the benchmark database test. Index Terms: Correntropy, pitch determination, reproducing kernel Hilbert space (RKHS).
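To make the idea of correntropy as a generalized correlation concrete, the following is a rough sketch under stated assumptions and not the paper's algorithm. It computes the sample correntropy of a single signal with a Gaussian kernel, V(lag) = mean of k(x[n] - x[n - lag]), and picks the lag with the largest value. The ERB filter bank and the summary across channels are omitted. The function names, the kernel width `sigma` and the search range are hypothetical choices made for this example.

```python
import math


def correntropy(signal, max_lag, sigma=0.1):
    """Sample correntropy for lags 0..max_lag using a Gaussian kernel.

    sigma is an assumed kernel width; it is not taken from the paper.
    """
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    values = []
    for lag in range(max_lag + 1):
        n = len(signal) - lag
        if n <= 0:
            raise ValueError("signal is too short for the requested lag")
        total = sum(
            math.exp(-((signal[i + lag] - signal[i]) ** 2) / (2.0 * sigma ** 2))
            for i in range(n)
        )
        values.append(norm * total / n)
    return values


def pitch_from_correntropy(signal, sample_rate, f_min, f_max, sigma=0.1):
    """Return the frequency (Hz) of the lag with maximal correntropy."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    v = correntropy(signal, lag_max, sigma)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda lag: v[lag])
    return sample_rate / best_lag


# Synthetic 200 Hz tone at 8 kHz; the 120-400 Hz range holds one period only.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(480)]
estimate = pitch_from_correntropy(tone, sample_rate, f_min=120.0, f_max=400.0)
```

Because the Gaussian kernel is a nonlinear function of the sample difference, its mean depends on moments of all even orders, which is how higher order statistics enter the estimate.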