2,119 research outputs found

    2-D Prony-Huang Transform: A New Tool for 2-D Spectral Analysis

    Full text link
    This work proposes an extension of the 1-D Hilbert Huang transform for the analysis of images. The proposed method consists in (i) adaptively decomposing an image into oscillating parts called intrinsic mode functions (IMFs) using a mode decomposition procedure, and (ii) providing a local spectral analysis of the obtained IMFs in order to get the local amplitudes, frequencies, and orientations. For the decomposition step, we propose two robust 2-D mode decompositions based on non-smooth convex optimization: a "Genuine 2-D" approach, that constrains the local extrema of the IMFs, and a "Pseudo 2-D" approach, which constrains separately the extrema of lines, columns, and diagonals. The spectral analysis step is based on Prony annihilation property that is applied on small square patches of the IMFs. The resulting 2-D Prony-Huang transform is validated on simulated and real data.Comment: 24 pages, 7 figure

    Graph Spectral Image Processing

    Full text link
    Recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image / video processing. The topics covered include image compression, image restoration, image filtering and image segmentation

    Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

    Full text link
    Dysarthria speech contains the pathological characteristics of vocal tract and vocal fold, but so far, they have not yet been included in traditional acoustic feature sets. Moreover, the nonlinearity and non-stationarity of speech have been ignored. In this paper, we propose a feature enhancement algorithm for dysarthria speech called WHFEMD. It combines empirical mode decomposition (EMD) and fast Walsh-Hadamard transform (FWHT) to enhance features. With the proposed algorithm, the fast Fourier transform of the dysarthria speech is first performed and then followed by EMD to get intrinsic mode functions (IMFs). After that, FWHT is used to output new coefficients and to extract statistical features based on IMFs, power spectral density, and enhanced gammatone frequency cepstral coefficients. To evaluate the proposed approach, we conducted experiments on two public pathological speech databases including UA Speech and TORGO. The results show that our algorithm performed better than traditional features in classification. We achieved improvements of 13.8% (UA Speech) and 3.84% (TORGO), respectively. Furthermore, the incorporation of an imbalanced classification algorithm to address data imbalance has resulted in a 12.18% increase in recognition accuracy. This algorithm effectively addresses the challenges of the imbalanced dataset and non-linearity in dysarthric speech and simultaneously provides a robust representation of the local pathological features of the vocal folds and tracts

    An improved higher-order analytical energy operator with adaptive local iterative filtering for early fault diagnosis of bearings

    Get PDF
    Early fault diagnosis in rolling bearings is crucial to maintenance and safety in industry. To highlight the weak fault features from complex signals combined with multiple interferences and heavy background noise, a novel approach for bearing fault diagnosis based on higher-order analytic energy operator (HO-AEO) and adaptive local iterative filtering (ALIF) is put forward. HO-AEO has better effect in dealing with heavy noise. However, it is subjected to the limitation of mono-components. To solve this limitation, ALIF is adopted firstly to decompose the nonlinear, non-stationary signals into multiple mono-components adaptively. In the next, the resonance frequency band as the optimal intrinsic mode function (IMF) is selected according to the maximum kurtosis. In the following, HO-AEO is utilized to highlight weak fault characteristics of the selected IMF. Finally, the early bearing fault is diagnosed by the energy operator spectrum based on fast Fourier transform (FFT). Comparisons in the simulation indicate that the fourth order HO-AEO shows the best performance in fault diagnosis compared with Teager energy operator (TEO), analytic energy operator (AEO), the second and the third order HO-AEO. The simulated test and experimental results demonstrate that the proposed approach could effectively extract weak fault characteristics from contaminated vibration signals

    CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

    Get PDF
    Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on tele diagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that are based on CNN and LSTM and uses two categories of characteristics Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC) obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first step, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second step, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third step by feeding these features into the LSTM and CNN models, which are designed to define sequential information from the extracted features. The experiments are performed using PC-GITA and Sakar datasets and 10-fold cross validation method, the highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN, and for the PC-GITA dataset, the accuracy is reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that the characteristics of GTCC are more appropriate and accurate for the assessment of PD than MFCC

    RRCNN+^{+}: An Enhanced Residual Recursive Convolutional Neural Network for Non-stationary Signal Decomposition

    Full text link
    Time-frequency analysis is an important and challenging task in many applications. Fourier and wavelet analysis are two classic methods that have achieved remarkable success in many fields. They also exhibit limitations when applied to nonlinear and non-stationary signals. To address this challenge, a series of nonlinear and adaptive methods, pioneered by the empirical mode decomposition method have been proposed. Their aim is to decompose a non-stationary signal into quasi-stationary components which reveal better features in the time-frequency analysis. Recently, inspired by deep learning, we proposed a novel method called residual recursive convolutional neural network (RRCNN). Not only RRCNN can achieve more stable decomposition than existing methods while batch processing large-scale signals with low computational cost, but also deep learning provides a unique perspective for non-stationary signal decomposition. In this study, we aim to further improve RRCNN with the help of several nimble techniques from deep learning and optimization to ameliorate the method and overcome some of the limitations of this technique.Comment: 8 pages, 4 figur
    corecore