11 research outputs found

    On the HHT, its problems, and some solutions

    Mechanical Systems and Signal Processing, Vol. 22, Number 6. The empirical mode decomposition (EMD) is reviewed and some questions related to its effective performance are discussed. It is interpreted in terms of AM/FM modulation. Solutions for its drawbacks are proposed. Numerical simulations are carried out to empirically evaluate the proposed modified EMD.

    Gear Fault Detection Based on Teager-Huang Transform

    Gear fault detection based on Empirical Mode Decomposition (EMD) and the Teager-Kaiser Energy Operator (TKEO) is presented. This novel method is named the Teager-Huang transform (THT). EMD can adaptively decompose the vibration signal into a series of zero-mean Intrinsic Mode Functions (IMFs). TKEO can track the instantaneous amplitude and instantaneous frequency of the IMFs at any instant. The experimental results provide evidence that the Teager-Huang transform has better resolution than the Hilbert-Huang transform. The Teager-Huang transform can effectively diagnose gear faults, thus providing a viable processing tool for gearbox defect detection and diagnosis.
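
As a rough illustration of how an energy operator can track amplitude and frequency, the sketch below applies the discrete Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], to a synthetic tone. It then separates amplitude and frequency with DESA-2, one common energy separation algorithm. The abstract does not say which separation scheme the authors use, so the choice of DESA-2, the function names `tkeo` and `desa2`, and the test tone (standing in for an IMF) are all assumptions.

```python
import numpy as np

def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2(x):
    """DESA-2 estimates of amplitude and frequency (radians/sample).

    Illustrative choice only; valid for frequencies below pi/2.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]        # symmetric difference, centred on samples 1..N-2
    psi_x = tkeo(x)[1:-1]     # trimmed so both energies cover samples 2..N-3
    psi_y = tkeo(y)
    cos_2w = np.clip(1.0 - psi_y / (2.0 * psi_x), -1.0, 1.0)
    omega = 0.5 * np.arccos(cos_2w)
    amp = 2.0 * psi_x / np.sqrt(psi_y)
    return amp, omega

# Synthetic tone standing in for a single IMF (hypothetical test signal).
n = np.arange(200)
tone = 2.0 * np.cos(0.3 * n + 0.5)
amp, omega = desa2(tone)
print(amp[:3], omega[:3])
```

For a pure tone A*cos(W*n + p) the operator returns the constant A^2 * sin(W)^2, which is why only three neighbouring samples are needed per estimate. Real IMFs are noisy, and the estimates would normally be smoothed.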

    Detection of clinical depression in adolescents' using acoustic speech analysis

    Clinical depression is a major risk factor in suicides and is associated with high mortality rates, making it one of the leading causes of death worldwide every year. Symptoms of depression often first appear during adolescence, at a time when the voice is changing in both males and females, suggesting that specific studies of these phenomena in adolescent populations are warranted. The properties of acoustic speech have previously been investigated as possible cues for depression in adults. However, these studies were restricted to small populations of patients, and the speech recordings were made during patients’ clinical interviews or fixed-text reading sessions. A collaborative effort with the Oregon Research Institute (ORI), USA, allowed the development of a new speech corpus consisting of a large sample of 139 adolescents (46 males and 93 females) divided into two groups (68 clinically depressed and 71 controls). The speech recordings were made during naturalistic interactions between adolescents and parents. Instead of covering a plethora of acoustic features, this study draws on the knowledge base of speech science and groups the acoustic features into five categories that relate to the physiological and perceptual areas of the speech production mechanism. These five acoustic feature categories consisted of prosodic, cepstral, spectral, glottal and Teager energy operator (TEO) based features. The effectiveness of these acoustic feature categories in detecting depression in adolescents was measured. The salient feature categories were determined by testing the feature categories and their combinations within a binary classification framework. 
Consistent with previous studies, it was observed that (i) there are strong gender-related differences in classification accuracy, and (ii) the glottal features provide an important enhancement of the classification accuracy when combined with other types of features. An important new contribution of this thesis was the observation that the TEO based features significantly outperformed the prosodic, cepstral, spectral and glottal features and their combinations. An investigation into the possible reasons for such strong performance of the TEO features pointed to the importance of nonlinear mechanisms associated with glottal flow formation as possible cues for depression.

    Modulation Domain Image Processing

    The classical Fourier transform is the cornerstone of traditional linear signal and image processing. The discrete Fourier transform (DFT) and the fast Fourier transform (FFT) in particular led to profound changes during the later decades of the last century in how we analyze and process 1D and multi-dimensional signals. The Fourier transform represents a signal as an infinite superposition of stationary sinusoids, each of which has constant amplitude and constant frequency. However, many important practical signals such as radar returns and seismic waves are inherently nonstationary. Hence, more complex techniques such as the windowed Fourier transform and the wavelet transform were invented to better capture nonstationary properties of these signals. In this dissertation, I studied an alternative nonstationary representation for images, the 2D AM-FM model. In contrast to the stationary nature of the classical Fourier representation, the AM-FM model represents an image as a finite sum of smoothly varying amplitudes and smoothly varying frequencies. The model has been applied successfully in image processing applications such as image segmentation, texture analysis, and target tracking. However, these applications are limited to analysis, meaning that the computed AM and FM functions are used as features for signal processing tasks such as classification and recognition. For synthesis applications, few attempts have been made to synthesize the original image from the AM and FM components. Nevertheless, these attempts were unstable and the synthesized results contained artifacts. The main reason is that the perfect reconstruction AM-FM image model was either unavailable or unstable. Here, I constructed the first functional perfect reconstruction AM-FM image transform that paves the way for AM-FM image synthesis applications. The transform enables intuitive nonlinear image filter designs in the modulation domain. I showed that these filters provide important advantages relative to traditional linear translation invariant filters.
This dissertation addresses image processing operations in the nonlinear nonstationary modulation domain. In the modulation domain, an image is modeled as a sum of nonstationary amplitude modulation (AM) functions and nonstationary frequency modulation (FM) functions. I developed a theoretical framework for high fidelity signal and image modeling in the modulation domain, constructed an invertible multi-dimensional AM-FM transform (xAMFM), and investigated practical signal processing applications of the transform. After developing the xAMFM, I investigated new image processing operations that apply directly to the transformed AM and FM functions in the modulation domain. In addition, I introduced two classes of modulation domain image filters. These filters produce perceptually motivated signal processing results that are difficult or impossible to obtain with traditional linear processing or spatial domain nonlinear approaches. Finally, I proposed three extensions of the AM-FM transform and applied them in image analysis applications.
The main original contributions of this dissertation include the following.
- I proposed a perfect reconstruction FM algorithm. I used a least-squares approach to recover the phase signal from its gradient. In order to allow perfect reconstruction of the phase function, I enforced an initial condition on the reconstructed phase. The perfect reconstruction FM algorithm plays a critical role in the overall AM-FM transform.
- I constructed a perfect reconstruction multi-dimensional filterbank by modifying the classical steerable pyramid. This modified filterbank ensures a true multi-scale multi-orientation signal decomposition. Such a decomposition is required for a perceptually meaningful AM-FM image representation.
- I rotated the partial Hilbert transform to alleviate rippling artifacts in the computed AM and FM functions. This adjustment results in artifact free filtering results in the modulation domain.
- I proposed the modulation domain image filtering framework. I constructed two classes of modulation domain filters. I showed that the modulation domain filters outperform traditional linear shift invariant (LSI) filters qualitatively and quantitatively in applications such as selective orientation filtering, selective frequency filtering, and fundamental geometric image transformations.
- I provided extensions of the AM-FM transform for image decomposition problems. I illustrated that the AM-FM approach can successfully decompose an image into coherent components such as texture and structural components.
- I investigated the relationship between the two prominent AM-FM computational models, namely the partial Hilbert transform approach (pHT) and the monogenic signal. The established relationship helps unify these two AM-FM algorithms.
This dissertation lays a theoretical foundation for future nonlinear modulation domain image processing applications. For the first time, one can apply modulation domain filters to images to obtain predictable results. The design of modulation domain filters is intuitive and simple, yet these filters produce superior results compared to those of pixel domain LSI filters. Moreover, this dissertation opens up other research problems. For instance, classical image applications such as image segmentation and edge detection can be re-formulated in the modulation domain setting. Modulation domain based perceptual image and video quality assessment and image compression are important future application areas for the fundamental representation results developed in this dissertation.
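
The least-squares phase recovery mentioned in the contributions can be sketched in one dimension. This is only an analogue under stated assumptions: the dissertation works with 2D image gradients, while the sketch uses forward differences of a 1D phase. The function name `recover_phase` and the extra matrix row that pins the first sample are hypothetical illustrations of "enforcing an initial condition", not the author's formulation.

```python
import numpy as np

def recover_phase(grad, phi0):
    """Least-squares phase from forward differences, with phi[0] pinned.

    grad[k] is assumed to equal phi[k+1] - phi[k]; phi0 is the initial
    condition that removes the unknown additive constant.
    """
    grad = np.asarray(grad, dtype=float)
    n = grad.size + 1
    system = np.zeros((n, n))
    rows = np.arange(n - 1)
    system[rows, rows] = -1.0        # difference rows: phi[k+1] - phi[k]
    system[rows, rows + 1] = 1.0
    system[n - 1, 0] = 1.0           # initial-condition row: phi[0] = phi0
    rhs = np.concatenate([grad, [phi0]])
    phi, *_ = np.linalg.lstsq(system, rhs, rcond=None)
    return phi

# Hypothetical smooth phase used only to exercise the sketch.
true_phase = 0.5 + 0.01 * np.arange(50) ** 2
estimate = recover_phase(np.diff(true_phase), true_phase[0])
print(np.max(np.abs(estimate - true_phase)))
```

In 1D the system is square, so the solution coincides with a cumulative sum. The least-squares form matters in 2D, where horizontal and vertical gradients over-determine the phase.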

    Automatic speaker recognition: modelling, feature extraction and effects of clinical environment

    Speaker recognition is the task of establishing the identity of an individual based on his/her voice. It has significant potential as a convenient biometric method for telephony applications and does not require sophisticated or dedicated hardware. The speaker recognition task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech. The features are used to generate statistical models of different speakers. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Current state-of-the-art speaker recognition systems use the Gaussian mixture model (GMM) technique in combination with the Expectation Maximization (EM) algorithm to build the speaker models. The most frequently used features are the Mel Frequency Cepstral Coefficients (MFCC). This thesis investigated areas of possible improvement in the field of speaker recognition. The identified drawbacks of current speaker recognition systems included slow convergence rates of the modelling techniques and the features’ sensitivity to changes due to aging of speakers, use of alcohol and drugs, changing health conditions and mental state. The thesis proposed a new method of deriving the GMM parameters, called the EM-ITVQ algorithm. The EM-ITVQ showed a significant improvement of the equal error rates and higher convergence rates when compared to the classical GMM based on the EM method. It was demonstrated that features based on the nonlinear model of speech production (TEO based features) provided better performance compared to the conventional MFCC features. For the first time, the effect of clinical depression on the speaker verification rates was tested. It was demonstrated that the speaker verification results deteriorate if the speakers are clinically depressed. 
The deterioration was demonstrated using conventional (MFCC) features. The thesis also showed that replacing the MFCC features with features based on the nonlinear model of speech production (TEO based features) can reduce the detrimental effect of clinical depression on speaker verification rates.
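
For orientation, the classical EM baseline that the abstract compares against can be sketched for a one-dimensional mixture. This is not the EM-ITVQ algorithm, whose details the abstract does not give. The scalar features (real systems use MFCC vectors), the quantile-based initialisation, the variance floor, and the name `em_gmm_1d` are all assumptions made for illustration.

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Classical EM for a 1-D Gaussian mixture (illustrative baseline only)."""
    x = np.asarray(x, dtype=float)
    # Deterministic initialisation from data quantiles (an assumption).
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    history = []
    for _ in range(iters):
        # E-step: weighted component densities and responsibilities.
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
        dens /= np.sqrt(2.0 * np.pi * var)
        total = dens.sum(axis=1, keepdims=True)
        history.append(float(np.log(total).sum()))  # log-likelihood so far
        resp = dens / total
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)  # guard against collapsing components
    return w, mu, var, history

# Synthetic two-cluster data standing in for speaker features.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])
weights, means, variances, loglik = em_gmm_1d(data)
print(means, loglik[-1])
```

The recorded log-likelihood should not decrease from one iteration to the next. The number of iterations needed for it to level off is one way to compare the convergence rates the abstract refers to.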

    Accurate telemonitoring of Parkinson's disease symptom severity using nonlinear speech signal processing and statistical machine learning

    This study focuses on the development of an objective, automated method to extract clinically useful information from sustained vowel phonations in the context of Parkinson’s disease (PD). The aim is twofold: (a) differentiate PD subjects from healthy controls, and (b) replicate the Unified Parkinson’s Disease Rating Scale (UPDRS) metric, which provides a clinical impression of PD symptom severity. This metric spans the range 0 to 176, where 0 denotes a healthy person and 176 total disability. Currently, UPDRS assessment requires the physical presence of the subject in the clinic, is subjective because it relies on the clinical rater’s expertise, and is logistically costly for national health systems. Hence, the practical frequency of symptom tracking is typically confined to once every several months, hindering recruitment for large-scale clinical trials and under-representing the true time scale of PD fluctuations. We develop a comprehensive framework to analyze speech signals by: (1) extracting novel, distinctive signal features, (2) using robust feature selection techniques to obtain a parsimonious subset of those features, and (3a) differentiating PD subjects from healthy controls, or (3b) determining UPDRS using powerful statistical machine learning tools. Towards this aim, we also investigate 10 existing fundamental frequency (F_0) estimation algorithms to determine the most useful algorithm for this application, and propose a novel ensemble F_0 estimation algorithm which leads to a 10% improvement in accuracy over the best individual approach. Moreover, we propose novel feature selection schemes which are shown to be very competitive against widely used schemes which are more complex. 
We demonstrate that we can successfully differentiate PD subjects from healthy controls with 98.5% overall accuracy, and also provide rapid, objective, and remote replication of UPDRS assessment with clinically useful accuracy (approximately 2 UPDRS points from the clinicians’ estimates), using only simple, self-administered, and non-invasive speech tests. The findings of this study strongly support the use of speech signal analysis as an objective basis for practical clinical decision support tools in the context of PD assessment. EThOS - Electronic Theses Online Service, GB, United Kingdom

    Efficient Schemes for Adaptive Frequency Tracking and their Relevance for EEG and ECG

    Amplitude and frequency are the two primary features of one-dimensional signals, and thus both are widely used to analyze data in numerous fields. While amplitude can be examined directly, frequency requires more elaborate approaches, except in the simplest cases. Consequently, a large number of techniques have been proposed over the years to retrieve information about frequency. The most famous method is probably power spectral density estimation. However, this approach is limited to stationary signals since the temporal information is lost. Time-frequency approaches were developed to tackle the problem of frequency estimation in non-stationary data. Although they can estimate the power of a signal in a given time interval and in a given frequency band, these tools have two drawbacks that make them less valuable in certain situations. First, due to their interdependent time and frequency resolutions, improving the accuracy in one domain means decreasing it in the other. Second, it is difficult to use this kind of approach to estimate the instantaneous frequency of a specific oscillatory component. A solution to these two limitations is provided by adaptive frequency tracking algorithms. Typically, these algorithms use a time-varying filter (a band-pass or notch filter in most cases) to extract an oscillation, and an adaptive mechanism to estimate its instantaneous frequency. The main objective of the first part of the present thesis is to develop such a scheme for adaptive frequency tracking, the single frequency tracker. This algorithm compares favorably with existing methods for frequency tracking in terms of bias, variance and convergence speed. The most distinguishing feature of this adaptive algorithm is that it maximizes the oscillatory behavior at its output. Furthermore, due to its specific time-varying band-pass filter, it does not introduce any distortion in the extracted component. 
This scheme is also extended to tackle certain situations, namely the presence of several oscillations in a single signal, the related issue of harmonic components, and the availability of more than one signal with the oscillation of interest. The first extension is aimed at tracking several components simultaneously. The basic idea is to use one tracker to estimate the instantaneous frequency of each oscillation. The second extension uses the additional information contained in several signals to achieve better overall performance. Specifically, it separately computes instantaneous frequency estimates for all available signals, which are then combined with weights minimizing the estimation variance. The third extension, which is based on an idea similar to the first one and uses the same weighting procedure as the second one, takes into account the harmonic structure of a signal to improve the estimation performance. A non-causal iterative method for offline processing is also developed in order to enhance an initial frequency trajectory by using future information in addition to past information. Like the single frequency tracker, this method aims at maximizing the oscillatory behavior at the output. Any approach can be used to obtain the initial trajectory. In the second part of this dissertation, the schemes for adaptive frequency tracking developed in the first part are applied to electroencephalographic and electrocardiographic data. In a first study, the single frequency tracker is used to analyze interactions between neuronal oscillations in different frequency bands, known as cross-frequency couplings, during a visual evoked potential experiment with illusory contour stimuli. With this adaptive approach ensuring that meaningful phase information is extracted, the differences in coupling strength between stimuli with and without illusory contours are more clearly highlighted than with traditional methods based on predefined filter-banks. 
In addition, the adaptive scheme leads to the detection of differences in instantaneous frequency. In a second study, two organization measures are derived from the harmonic extension. The first is based on the power distribution in the frequency domain, and the second on the phase relation between harmonic components. These measures, computed from the surface electrocardiogram, are shown to help predict the outcome of catheter ablation of persistent atrial fibrillation. The proposed adaptive frequency tracking schemes are also applied to signals recorded in the field of sport sciences in order to illustrate their potential uses. To summarize, the present thesis introduces several algorithms for adaptive frequency tracking. These algorithms are presented in full detail and then applied to practical situations. In particular, they are shown to improve the detection of coupling mechanisms in brain activity and to provide relevant organization measures for atrial fibrillation.
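
The idea of combining per-signal frequency estimates "with weights minimizing the estimation variance" can be illustrated with standard inverse-variance weighting. The sketch assumes unbiased, uncorrelated estimates with known variances. The thesis may estimate the variances adaptively and per time step, which the abstract does not specify. The function name `combine_estimates` and the example numbers are hypothetical.

```python
import numpy as np

def combine_estimates(estimates, variances):
    """Minimum-variance linear combination of uncorrelated, unbiased estimates.

    Weights are proportional to 1/variance and sum to one.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precision = 1.0 / variances
    weights = precision / precision.sum()
    combined = float(np.sum(weights * estimates))
    combined_var = float(1.0 / precision.sum())
    return combined, weights, combined_var

# Two hypothetical frequency estimates (Hz) from two channels.
freq, weights, var = combine_estimates([10.2, 9.8], [1.0, 4.0])
print(freq, weights, var)
```

The combined variance is never larger than the smallest input variance, so adding a noisier channel still helps under these assumptions.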