
18 research outputs found

CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Author: BELHOUSSINE DRISSI Taoufiq
BOUALOULOU Nouhaila
NSIRI Benayad
Publication venue: Lublin University of Technology
Publication date: 30/06/2023

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on telediagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that is based on CNN and LSTM and uses two categories of features, Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC), obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first stage, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second stage, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third stage by feeding these features into the LSTM and CNN models, which are designed to capture sequential information from the extracted features. The experiments are performed using the PC-GITA and Sakar datasets and the 10-fold cross-validation method. The highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN; for the PC-GITA dataset, the accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that GTCC features are more appropriate and accurate for the assessment of PD than MFCC.
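To make the first (denoising) stage of the abstract above more concrete, here is a minimal sketch of wavelet-threshold denoising. It is not the authors' EMD-DWT pipeline: it covers only a DWT step, and the single-level Haar wavelet, the soft-threshold rule, the threshold value, and all function names (`haar_dwt`, `haar_idwt`, `soft_threshold`, `dwt_denoise`) are assumptions chosen for illustration.

```python
import numpy as np


def haar_dwt(x):
    """Single-level Haar DWT of an even-length signal (assumed wavelet)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)  # low-pass (smooth) part
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)  # high-pass (mostly noise) part
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = np.empty(2 * len(approx))
    out[0::2] = (approx + detail) / np.sqrt(2)
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink coefficients toward zero; small ones become exactly zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)


def dwt_denoise(x, threshold):
    """Denoise by soft-thresholding only the detail coefficients."""
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; a larger threshold removes more of the high-frequency detail. A practical system would likely use several decomposition levels and a data-driven threshold.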

Lublin University of Technology Journals

Oesophageal Speech’s Formants Measurement Using Wavelet Transform

Author: Amaia Mendez
Begona Garcia Zapirain
Ibon Ruiz
Publication venue: IntechOpen
Publication date: 04/04/2012

IntechOpen

Models and Analysis of Vocal Emissions for Biomedical Applications

Publication venue: Firenze University Press
Publication date: 31/05/2022

The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

Directory of Open Access Books (DOAB)

Seismic characterisation based on time-frequency spectral analysis

Author: Al Salmi Haifa
Publication date: 01/02/2022

We present high-resolution time-frequency spectral analysis schemes to better resolve seismic images for the purpose of seismic and petroleum reservoir characterisation. Seismic characterisation is based on the physical properties of the Earth's subsurface media, and these properties are represented implicitly by seismic attributes. Because seismic traces originally presented in the time domain are non-stationary signals, for which the properties vary with time, we characterise those signals by obtaining seismic attributes which also vary with time. Among the widely used attributes are spectral attributes calculated through time-frequency decomposition. Time-frequency spectral decomposition methods are employed to capture variations of a signal within the time-frequency domain. These decomposition methods generate a frequency vector at each time sample, referred to as the spectral component. The computed spectral component enables us to explore the additional frequency dimension which exists jointly with the original time dimension, enabling localisation and characterisation of patterns within the seismic section. Conventional time-frequency decomposition methods include the continuous wavelet transform and the Wigner-Ville distribution. These methods suffer from challenges that hinder accurate interpretation when used for seismic interpretation. The continuous wavelet transform aims to decompose signals on a basis of elementary signals which have to be localised in time and frequency, but this method suffers from resolution and localisation limitations in the time-frequency spectrum. In addition, smearing often emerges from this ill-localisation. The Wigner-Ville distribution distributes the energy of the signal over the two variables time and frequency and results in highly localised signal components. Yet, the method suffers from spurious cross-term interference due to its quadratic nature.
This interference is misleading when the spectrum is used for interpretation purposes. For the specific application to seismic data, the interference obscures geological features and distorts geophysical details. This thesis focuses on developing high-fidelity and high-resolution time-frequency spectral decomposition methods as an extension to the existing conventional methods. These methods are then adopted as a means to resolve seismic images for petroleum reservoirs. These methods are validated in terms of physics, robustness, and accurate energy localisation, using an extensive set of synthetic and real data sets including both carbonate and clastic reservoir settings. The novel contributions achieved in this thesis include developing time-frequency analysis algorithms for seismic data, allowing improved interpretation and accurate characterisation of petroleum reservoirs. The first algorithm established in this thesis is the Wigner-Ville distribution (WVD) with an additional masking filter. The standard WVD spectrum has high resolution but suffers from the cross-term interference caused by multiple components in the signal. To suppress the cross-term interference, I designed a masking filter based on the spectrum of the smoothed-pseudo WVD (SP-WVD). The original SP-WVD incorporates smoothing filters in both time and frequency directions to suppress the cross-term interference, which reduces the resolution of the time-frequency spectrum. In order to overcome this side-effect, I used the SP-WVD spectrum as a reference to design a masking filter, and applied it to the standard WVD spectrum. Therefore, the mask-filtered WVD (MF-WVD) can preserve the high-resolution feature of the standard WVD while suppressing the cross-term interference as effectively as the SP-WVD. The second algorithm developed in this thesis is the synchrosqueezing wavelet transform (SWT) equipped with a directional filter.
A transformation algorithm such as the continuous wavelet transform (CWT) might cause smearing in the time-frequency spectrum, i.e. a lack of localisation. The SWT attempts to improve the localisation of the time-frequency spectrum generated by the CWT. The real part of the complex SWT spectrum, after directional filtering, is capable of resolving the stratigraphic boundaries of thin layers within target reservoirs. In terms of seismic characterisation, I tested the high-resolution spectral results on a complex clastic reservoir interbedded with coal seams from the Ordos basin, northern China. I used the spectral results generated using the MF-WVD method to facilitate the interpretation of the sand distribution within the dataset. In another implementation, I used the SWT spectral results and the original seismic data together as the input to a deep convolutional neural network (dCNN) to track the horizons within a 3D volume. Using these application-based procedures, I have effectively extracted the spatial variation and the thickness of thinly layered sandstone in a coal-bearing reservoir. I also tested the algorithm on a carbonate reservoir from the Tarim basin, western China. I used the spectrum generated by the synchrosqueezing wavelet transform equipped with directional filtering to characterise faults, karsts, and direct hydrocarbon indicators within the reservoir. Finally, I investigated pore-pressure prediction in carbonate layers. Pore-pressure variation generates subtle changes in the P-wave velocity of carbonate rocks. This suggests that existing empirical relations capable of predicting pore-pressure in clastic rocks are unsuitable for the prediction in carbonate rocks. I implemented the prediction based on the P-wave velocity and the wavelet transform multi-resolution analysis (WT-MRA). The WT-MRA method can unfold information within the frequency domain by decomposing the P-wave velocity.
This enables us to extract and amplify hidden information embedded in the signal. Using Biot's theory, WT-MRA decomposition results can be divided into contributions from the pore-fluid and the rock framework. Therefore, I proposed a pore-pressure prediction model which is based on the pore-fluid contribution, calculated through WT-MRA, to the P-wave velocity. Open Access
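The masking idea in the abstract above (use a smoothed spectrum as a reference, then apply the resulting mask to the unsmoothed WVD) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: a plain moving-average smoothing stands in for the SP-WVD, the mask is a simple relative threshold, and the names `wigner_ville`, `box_smooth`, `mask_filtered_wvd` and all parameter defaults are hypothetical.

```python
import numpy as np


def wigner_ville(x):
    """Discrete WVD of a complex (analytic) signal.

    Row n is the time sample; column k corresponds to the
    frequency k / (2 * N) cycles per sample, with N = len(x).
    """
    x = np.asarray(x, dtype=complex)
    n_samples = len(x)
    w = np.zeros((n_samples, n_samples))
    for n in range(n_samples):
        max_lag = min(n, n_samples - 1 - n)  # stay inside the signal
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n_samples, dtype=complex)
        kernel[lags % n_samples] = x[n + lags] * np.conj(x[n - lags])
        # The kernel is Hermitian in the lag, so its FFT is real.
        w[n] = np.fft.fft(kernel).real
    return w


def box_smooth(a, n_time, n_freq):
    """Moving-average smoothing along time (axis 0) and frequency (axis 1).

    Assumption: this crude filter only stands in for the SP-WVD.
    """
    k_time = np.ones(n_time) / n_time
    k_freq = np.ones(n_freq) / n_freq
    a = np.apply_along_axis(np.convolve, 0, a, k_time, mode="same")
    return np.apply_along_axis(np.convolve, 1, a, k_freq, mode="same")


def mask_filtered_wvd(x, n_time=9, n_freq=3, rel_threshold=0.5):
    """Keep WVD values only where the smoothed reference is strong."""
    w = wigner_ville(x)
    reference = box_smooth(w, n_time, n_freq)
    mask = reference > rel_threshold * reference.max()
    return w * mask, w
```

For a signal made of two stationary tones, the oscillating cross-term midway between them averages out in the smoothed reference, so the mask removes it while leaving the two auto-terms at full WVD resolution.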

Spiral - Imperial College Digital Repository

Emotion Recognition from Speech with Acoustic, Non-Linear and Wavelet-based Features Extracted in Different Acoustic Conditions

Author: Vásquez Correa Juan Camilo
Publication venue: Medellín, Colombia
Publication date: 01/01/2016

ABSTRACT: In recent years, there has been great progress in automatic speech recognition. The challenge now is not only to recognize the semantic content of speech but also its so-called "paralinguistic" aspects, including the emotions and the personality of the speaker. This research work aims at the development of a methodology for automatic emotion recognition from speech signals in non-controlled noise conditions. For that purpose, different sets of acoustic, non-linear, and wavelet-based features are used to characterize emotions in different databases created for such a purpose.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblioteca Digital del Sistema de Bibliotecas de la Universidad de Antioquia

Models and analysis of vocal emissions for biomedical applications

Publication venue: Firenze University Press
Publication date: 31/05/2022

This book of Proceedings collects the papers presented at the 4th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2005, held 29-31 October 2005 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial developments in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

Directory of Open Access Books (DOAB)

Advances in Vibration Analysis Research

Publication venue: IntechOpen
Publication date: 20/04/2021

Vibrations are extremely important in all areas of human activity, for all sciences, technologies and industrial applications. Sometimes these vibrations are useful, but at other times they are undesirable. In any case, the understanding and analysis of vibrations are crucial. This book reports on state-of-the-art research and development findings on this very broad matter through 22 original and innovative research studies exhibiting various investigation directions. The present book is the result of contributions from experts in the international scientific community working on different aspects of vibration analysis. The text is addressed not only to researchers, but also to professional engineers, students and other experts in a variety of disciplines, both academic and industrial, seeking to gain a better understanding of what has been done in the field recently and what kinds of open problems remain in this area.

Directory of Open Access Books (DOAB)

Analysis and detection of human emotion and stress from speech signals

Author: TIN LAY NWE
Publication date: 03/08/2004

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

A Statistical Perspective of the Empirical Mode Decomposition

Author: Campi Marta
Publication venue: UCL (University College London)
Publication date: 28/04/2022

This research focuses on non-stationary basis decomposition methods in time-frequency analysis. Classical methodologies in this field, such as Fourier Analysis and Wavelet Transforms, rely on strong assumptions about the underlying moment generating process, which may not be valid in real data scenarios or modern applications of machine learning. The literature on non-stationary methods is still in its infancy, and the research contained in this thesis aims to address challenges arising in this area. Among several alternatives, this work is based on the method known as the Empirical Mode Decomposition (EMD). The EMD is a non-parametric time-series decomposition technique that produces a set of time-series functions denoted as Intrinsic Mode Functions (IMFs), which carry specific statistical properties. The main focus is providing a general and flexible family of basis extraction methods with minimal requirements compared to those within the Fourier or Wavelet techniques. This is highly important for two main reasons: first, more universal applications can be taken into account; secondly, the EMD requires very little a priori knowledge of the process in order to be applied, and as such, it can have greater generalisation properties in statistical applications across a wide array of applications and data types. The contributions of this work deal with several aspects of the decomposition. The first set regards the construction of an IMF from several perspectives: (1) achieving a semi-parametric representation of each basis; (2) extracting such semi-parametric functional forms in a computationally efficient and statistically robust framework. The EMD belongs to the class of path-based decompositions and, therefore, it is often not treated as a stochastic representation. (3) A major contribution involves the embedding of the deterministic pathwise decomposition framework into a formal stochastic process setting.
One of the assumptions inherent to the EMD construction is the requirement for a continuous function to which the decomposition is applied. In general, this may not be the case in many applications. (4) Various multi-kernel Gaussian Process formulations of the EMD will be proposed through the introduced stochastic embedding. In particular, two different models will be proposed: one modelling the temporal mode of oscillations of the EMD and the other capturing the location of instantaneous frequencies in specific frequency regions or bandwidths. (5) The construction of the second stochastic embedding will be achieved with an optimisation method called the cross-entropy method. Two formulations will be provided and explored in this regard. Applications to speech time series are explored to study such methodological extensions, given that they are non-stationary.
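As a rough illustration of how the EMD described in the abstract above extracts an IMF, the following sketch performs the classical sifting step: subtract the mean of the upper and lower envelopes and repeat. It is a simplified sketch, not the thesis method: the envelopes use linear interpolation via `np.interp` instead of the cubic splines normally used, the number of sifting passes is fixed, and the names `local_extrema`, `sift_once` and `first_imf` are hypothetical.

```python
import numpy as np


def local_extrema(x):
    """Return indices of interior local maxima and minima of x."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def sift_once(x):
    """One sifting pass: remove the mean of the two envelopes."""
    n = np.arange(len(x))
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return x.copy()  # too few extrema to build envelopes
    # Assumption: linear envelopes (np.interp holds the end values flat).
    upper = np.interp(n, maxima, x[maxima])
    lower = np.interp(n, minima, x[minima])
    return x - (upper + lower) / 2.0


def first_imf(x, n_sifts=5):
    """Approximate the first (fastest) IMF with a fixed number of sifts."""
    h = np.asarray(x, dtype=float)
    for _ in range(n_sifts):
        h = sift_once(h)
    return h
```

Subtracting the returned IMF from the signal and repeating the procedure on the residual would yield the slower modes; a full implementation would also need a stopping criterion and more careful boundary handling.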

UCL Discovery