2-D Prony-Huang Transform: A New Tool for 2-D Spectral Analysis
This work proposes an extension of the 1-D Hilbert-Huang transform for the
analysis of images. The proposed method consists in (i) adaptively decomposing
an image into oscillating parts called intrinsic mode functions (IMFs) using a
mode decomposition procedure, and (ii) providing a local spectral analysis of
the obtained IMFs in order to get the local amplitudes, frequencies, and
orientations. For the decomposition step, we propose two robust 2-D mode
decompositions based on non-smooth convex optimization: a "Genuine 2-D"
approach, that constrains the local extrema of the IMFs, and a "Pseudo 2-D"
approach, which constrains separately the extrema of lines, columns, and
diagonals. The spectral analysis step is based on the Prony annihilation property,
which is applied to small square patches of the IMFs. The resulting 2-D
Prony-Huang transform is validated on simulated and real data. Comment: 24 pages, 7 figures
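As a simplified 1-D illustration of the annihilation idea (not the 2-D patch-based estimator the abstract describes), a single real sinusoid x[n] = A cos(ωn + φ) satisfies x[n+1] - 2cos(ω)x[n] + x[n-1] = 0, so a least-squares fit of that recurrence recovers ω. The function name and the test signal below are hypothetical and chosen only for illustration.

```python
import math


def prony_single_frequency(x):
    """Estimate the frequency (radians/sample) of one real sinusoid.

    Assumption: x[n] = A*cos(omega*n + phi), so the filter
    [1, -2*cos(omega), 1] annihilates the signal. We solve for
    c = cos(omega) in the least-squares sense over interior samples.
    """
    num = 0.0
    den = 0.0
    for n in range(1, len(x) - 1):
        num += x[n] * (x[n + 1] + x[n - 1])
        den += 2.0 * x[n] * x[n]
    if den == 0.0:
        raise ValueError("signal is zero on all interior samples")
    c = max(-1.0, min(1.0, num / den))  # guard against rounding outside [-1, 1]
    return math.acos(c)


# Hypothetical noiseless test signal with omega = 0.7 rad/sample.
signal = [1.5 * math.cos(0.7 * n + 0.3) for n in range(16)]
omega_hat = prony_single_frequency(signal)
```

On noiseless data the estimate matches the true frequency up to rounding; with noise, the least-squares form averages over all interior samples instead of relying on a single triple.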
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we give an overview of recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering, and image
segmentation.
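The idea of treating an image patch as a graph signal can be sketched as follows. This is a minimal example, assuming a 4-connected pixel grid and Gaussian intensity-difference edge weights; both are common choices but are assumptions here, not something the abstract specifies. The function names and the `sigma` parameter are hypothetical.

```python
import math


def grid_graph_laplacian(patch, sigma=10.0):
    """Combinatorial Laplacian L = D - W of a 4-connected pixel graph.

    Assumed edge weight: w_ij = exp(-(x_i - x_j)^2 / sigma^2), so edges
    that cross a strong intensity change get a weight near zero.
    """
    rows, cols = len(patch), len(patch[0])
    n = rows * cols
    lap = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    diff = patch[r][c] - patch[rr][cc]
                    w = math.exp(-(diff * diff) / (sigma * sigma))
                    lap[i][j] -= w
                    lap[j][i] -= w
                    lap[i][i] += w
                    lap[j][j] += w
    return lap


def laplacian_quadratic_form(lap, signal):
    """x^T L x, which equals the sum over edges of w_ij * (x_i - x_j)^2."""
    n = len(signal)
    return sum(signal[i] * lap[i][j] * signal[j]
               for i in range(n) for j in range(n))


# Hypothetical 2x3 patch with a vertical edge between columns 1 and 2.
patch = [[10, 10, 200], [10, 10, 200]]
lap = grid_graph_laplacian(patch)
pixels = [v for row in patch for v in row]
smoothness_of_patch = laplacian_quadratic_form(lap, pixels)
smoothness_of_checkerboard = laplacian_quadratic_form(lap, [1, -1, 1, -1, 1, -1])
```

The patch itself is nearly smooth on its own graph because the only large pixel differences sit on near-zero-weight edges, whereas a checkerboard signal is penalized on every strong edge. Quadratic forms of this kind are what graph-based smoothness priors for restoration build on.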
Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform
Dysarthric speech carries pathological characteristics of the vocal tract
and vocal folds, but so far these have not been included in traditional
acoustic feature sets. Moreover, the nonlinearity and non-stationarity of
speech have been ignored. In this paper, we propose a feature enhancement
algorithm for dysarthric speech called WHFEMD. It combines empirical mode
decomposition (EMD) and fast Walsh-Hadamard transform (FWHT) to enhance
features. The proposed algorithm first applies the fast Fourier transform to the
dysarthric speech and then applies EMD to obtain intrinsic
mode functions (IMFs). After that, FWHT is used to output new coefficients and
to extract statistical features based on IMFs, power spectral density, and
enhanced gammatone frequency cepstral coefficients. To evaluate the proposed
approach, we conducted experiments on two public pathological speech databases
including UA Speech and TORGO. The results show that our algorithm performed
better than traditional features in classification. We achieved improvements of
13.8% (UA Speech) and 3.84% (TORGO), respectively. Furthermore, the
incorporation of an imbalanced classification algorithm to address data
imbalance resulted in a 12.18% increase in recognition accuracy. This
algorithm effectively addresses the challenges of the imbalanced dataset and
non-linearity in dysarthric speech and simultaneously provides a robust
representation of the local pathological features of the vocal folds and
tracts.
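For readers unfamiliar with the FWHT step, here is a minimal sketch of an unnormalized fast Walsh-Hadamard transform in natural (Hadamard) ordering. The abstract does not state which ordering or normalization WHFEMD uses, so both are assumptions, and the function name is hypothetical.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order.

    Uses the in-place butterfly (a, b) -> (a + b, a - b) on a copy of the
    input, in O(n log n) additions. The length must be a power of two.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


coeffs = fwht([1, 0, 1, 0, 0, 1, 1, 0])  # -> [4, 2, 0, -2, 0, 2, 0, 2]
```

Because the Hadamard matrix H satisfies H·H = n·I, applying `fwht` twice returns the input scaled by its length, which gives a quick sanity check.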
An improved higher-order analytical energy operator with adaptive local iterative filtering for early fault diagnosis of bearings
Early fault diagnosis in rolling bearings is crucial to maintenance and safety in industry. To highlight weak fault features in complex signals that contain multiple interferences and heavy background noise, a novel approach to bearing fault diagnosis based on the higher-order analytic energy operator (HO-AEO) and adaptive local iterative filtering (ALIF) is put forward. HO-AEO copes well with heavy noise, but it is limited to mono-component signals. To overcome this limitation, ALIF is first adopted to adaptively decompose the nonlinear, non-stationary signal into multiple mono-components. Next, the resonance frequency band is selected as the optimal intrinsic mode function (IMF) according to the maximum kurtosis. Then, HO-AEO is used to highlight the weak fault characteristics of the selected IMF. Finally, the early bearing fault is diagnosed from the energy operator spectrum computed with the fast Fourier transform (FFT). Comparisons in simulation indicate that the fourth-order HO-AEO gives the best fault diagnosis performance compared with the Teager energy operator (TEO), the analytic energy operator (AEO), and the second- and third-order HO-AEO. The simulated test and experimental results demonstrate that the proposed approach can effectively extract weak fault characteristics from contaminated vibration signals.
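Two building blocks named in this abstract can be sketched briefly: the classic discrete Teager energy operator (the TEO baseline, not the paper's HO-AEO) and maximum-kurtosis selection among candidate components. ALIF itself is not implemented here; the `components` list stands in for its output, and all function names are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns values for interior samples only (length len(x) - 2).
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def kurtosis(x):
    """Pearson (non-excess) kurtosis: fourth central moment / variance^2."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)


def select_by_kurtosis(components):
    """Index of the component with the largest kurtosis (most impulsive)."""
    return max(range(len(components)), key=lambda i: kurtosis(components[i]))


# Hypothetical stand-ins for decomposed components:
smooth = [math.sin(2 * math.pi * n / 20) for n in range(200)]   # pure tone
impulsive = [1.0 if n % 50 == 0 else 0.0 for n in range(200)]   # sparse impacts
best = select_by_kurtosis([smooth, impulsive])                   # -> 1
energy = teager_energy(impulsive)
```

For a pure tone A cos(ωn + φ), the Teager operator returns the constant A² sin²(ω), which is why it tracks amplitude and frequency jointly. The impulsive component wins the kurtosis selection because periodic impacts produce heavy-tailed amplitude distributions.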
CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC
Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on telediagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that is based on CNN and LSTM and uses two categories of features, Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC), obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first stage, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second stage, the GTCC and MFCC are extracted from the enhanced audio signals. In the third stage, classification is carried out by feeding these features into the LSTM and CNN models, which are designed to capture sequential information from the extracted features. The experiments are performed on the PC-GITA and Sakar datasets with 10-fold cross-validation. The highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN; for the PC-GITA dataset, the accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that GTCC features are more appropriate and accurate for the assessment of PD than MFCC.
RRCNN: An Enhanced Residual Recursive Convolutional Neural Network for Non-stationary Signal Decomposition
Time-frequency analysis is an important and challenging task in many
applications. Fourier and wavelet analysis are two classic methods that have
achieved remarkable success in many fields. However, they exhibit limitations when
applied to nonlinear and non-stationary signals. To address this challenge, a
series of nonlinear and adaptive methods, pioneered by the empirical mode
decomposition method, have been proposed. Their aim is to decompose a
non-stationary signal into quasi-stationary components which reveal better
features in the time-frequency analysis. Recently, inspired by deep learning,
we proposed a novel method called residual recursive convolutional neural
network (RRCNN). Not only can RRCNN achieve more stable decomposition than
existing methods while batch processing large-scale signals with low
computational cost, but deep learning also provides a unique perspective on
non-stationary signal decomposition. In this study, we aim to further improve
RRCNN with several nimble techniques from deep learning and optimization
that overcome some of its limitations. Comment: 8 pages, 4 figures