
    Optimal set of EEG features for emotional state classification and trajectory visualization in Parkinson's disease

    In addition to classic motor signs and symptoms, individuals with Parkinson's disease (PD) are characterized by emotional deficits. Ongoing brain activity can be recorded by electroencephalography (EEG) to discover the links between emotional states and brain activity. This study utilized machine-learning algorithms to categorize emotional states in PD patients compared with healthy controls (HC) using EEG. Twenty non-demented PD patients and 20 healthy age-, gender-, and education-matched controls viewed happiness, sadness, fear, anger, surprise, and disgust emotional stimuli while fourteen-channel EEG was recorded. Multimodal stimuli (combinations of audio and visual material) were used to evoke the emotions. To classify the EEG-based emotional states and visualize the changes of emotional states over time, this paper compares four kinds of EEG features for emotional state classification and proposes an approach to track the trajectory of emotion changes with manifold learning. From the experimental results on our EEG data set, we found that (a) the bispectrum feature is superior to the other three kinds of features, namely power spectrum, wavelet packet, and nonlinear dynamical analysis; (b) higher frequency bands (alpha, beta, and gamma) play a more important role in emotion activities than lower frequency bands (delta and theta) in both groups; and (c) the trajectory of emotion changes can be visualized by reducing subject-independent features with manifold learning. This provides a promising way of visualizing a patient's emotional state in real time and leads toward a practical system for noninvasive assessment of the emotional impairments associated with neurological disorders.
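
    As a concrete illustration of the pipeline this abstract describes, below is a minimal sketch pairing a bispectrum-derived feature with a manifold-learning embedding. The sampling rate, segment length, diagonal-slice feature, and the use of scikit-learn's Isomap are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from sklearn.manifold import Isomap

def bispectrum_diag(x, nfft=128):
    """Log magnitude of the diagonal of the direct bispectrum estimate,
    B(f, f) = X(f) * X(f) * conj(X(2f)), for one EEG segment."""
    X = np.fft.rfft(x, nfft)
    half = len(X) // 2
    d = np.array([X[f] * X[f] * np.conj(X[2 * f]) for f in range(half)])
    return np.log1p(np.abs(d))

# Toy data: 200 consecutive 1-second, single-channel segments at 128 Hz (assumed).
rng = np.random.default_rng(0)
segments = rng.standard_normal((200, 128))

# One 32-dimensional bispectrum feature vector per segment.
feats = np.array([bispectrum_diag(s) for s in segments])

# A 2-D manifold embedding of the per-segment features; plotting the rows in
# time order traces an emotion-change trajectory of the kind the paper describes.
trajectory = Isomap(n_neighbors=10, n_components=2).fit_transform(feats)
print(trajectory.shape)  # (200, 2): one point per segment, ordered in time
```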

    Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

    Recently, researchers have paid increasing attention to studying the emotional state of an individual from his/her speech signals, as speech is the fastest and most natural method of communication between individuals. In this work, a new feature enhancement using a Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were utilized to evaluate the proposed methods. An extreme learning machine (ELM) and a k-nearest neighbor (kNN) classifier were employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed methods significantly improved speech emotion recognition performance compared to research works published in the literature.
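
    One plausible reading of GMM-based feature enhancement is to append GMM posterior probabilities to the raw feature vectors before classification. The sketch below follows that reading with a kNN classifier on synthetic data; the component count, feature dimensionality, and the omission of the ELM stage are assumptions, not the paper's exact transform.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 13))   # 13 MFCC-like speech features (toy data)
y = rng.integers(0, 4, 300)          # 4 toy emotion classes

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit a GMM on the training features and append its posteriors as new features.
gmm = GaussianMixture(n_components=8, random_state=0).fit(Xtr)
Ztr = np.hstack([Xtr, gmm.predict_proba(Xtr)])   # "enhanced" feature vectors
Zte = np.hstack([Xte, gmm.predict_proba(Xte)])

knn = KNeighborsClassifier(n_neighbors=5).fit(Ztr, ytr)
print("toy accuracy:", knn.score(Zte, yte))      # near chance on random data
```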

    Perceptually Motivated Wavelet Packet Transform for Bioacoustic Signal Enhancement

    A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim–Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms other approaches for a wide range of noise conditions.
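
    To make the wavelet-packet denoising idea concrete, here is a minimal sketch using plain soft thresholding with PyWavelets. It stands in for, but does not reproduce, the paper's perceptually scaled, Greenwood-warped decomposition and its speech-style spectral estimators; the wavelet choice, decomposition depth, and threshold rule are standard textbook assumptions.

```python
import numpy as np
import pywt

rng = np.random.default_rng(2)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 800 * t) * np.exp(-3 * t)   # toy "vocalization"
noisy = clean + 0.3 * rng.standard_normal(fs)           # additive white noise

# Wavelet packet decomposition, then soft-threshold each terminal node.
wp = pywt.WaveletPacket(noisy, wavelet="db8", mode="symmetric", maxlevel=5)
for node in wp.get_level(5, order="natural"):
    sigma = np.median(np.abs(node.data)) / 0.6745       # robust noise estimate
    thr = sigma * np.sqrt(2 * np.log(len(node.data)))   # universal threshold
    node.data = pywt.threshold(node.data, thr, mode="soft")

enhanced = wp.reconstruct(update=True)[: len(noisy)]
snr = 10 * np.log10(np.sum(clean**2) / np.sum((enhanced - clean) ** 2))
print(f"output SNR: {snr:.1f} dB")
```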

    Emotion Recognition from Speech with Acoustic, Non-Linear and Wavelet-based Features Extracted in Different Acoustic Conditions

    In recent years, there has been great progress in automatic speech recognition. The challenge now is not only to recognize the semantic content of speech but also its so-called "paralinguistic" aspects, including the emotions and the personality of the speaker. This research work aims at developing a methodology for automatic emotion recognition from speech signals in non-controlled noise conditions. For that purpose, different sets of acoustic, non-linear, and wavelet-based features are used to characterize emotions in different databases created for this purpose.
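
    For illustration only, the sketch below computes one representative feature from each of the three families the abstract names: an acoustic feature (zero-crossing rate), a non-linear feature (mean Teager-Kaiser energy), and wavelet-based features (subband energies). The specific functions, frame length, and settings are assumptions, not the paper's exact feature set.

```python
import numpy as np
import pywt

def zero_crossing_rate(x):
    # Fraction of adjacent sample pairs whose signs differ.
    return np.mean(np.abs(np.diff(np.signbit(x).astype(int))))

def mean_teager_energy(x):
    # Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]
    return np.mean(x[1:-1] ** 2 - x[:-2] * x[2:])

def wavelet_subband_energies(x, wavelet="db4", level=4):
    coeffs = pywt.wavedec(x, wavelet, level=level)
    return [float(np.sum(c ** 2)) for c in coeffs]

rng = np.random.default_rng(3)
frame = rng.standard_normal(4000)            # one 0.25 s frame at 16 kHz (toy)
features = [zero_crossing_rate(frame), mean_teager_energy(frame),
            *wavelet_subband_energies(frame)]
print(len(features), "features per frame")
```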

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, held on a biennial basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and the classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier), and the IEEE Biomedical Engineering Society. Special issues of international journals have been, and will be, published collecting selected papers from the conference.

    Comprehensive Study of Automatic Speech Emotion Recognition Systems

    Speech emotion recognition (SER) is the technology that recognizes psychological characteristics and feelings from speech signals through various techniques and methodologies. SER is challenging because of the considerable variation in arousal and valence levels across different languages. Various technical developments in artificial intelligence and signal processing methods have encouraged and made it possible to interpret emotions. SER plays a vital role in remote communication. This paper offers a recent survey of SER using machine learning (ML) and deep learning (DL)-based techniques. It focuses on the various feature representation and classification techniques used for SER. Further, it describes details about the databases and evaluation metrics used for speech emotion recognition.

    Brain-computer interface of focus and motor imagery using wavelet and recurrent neural networks

    A brain-computer interface (BCI) is a technology that allows operating a device without involving muscles or sound, but directly from the brain through processed electrical signals. The technology works by capturing electrical or magnetic signals from the brain, which are then processed to obtain the information contained therein. Usually, a BCI uses information from electroencephalogram (EEG) signals based on various variables under review. This study proposed a BCI to move external devices, such as a drone simulator, based on EEG signal information. Motor imagery (MI) and focus variables were extracted from the EEG signal using wavelets. They were then classified by recurrent neural networks (RNN). To overcome the vanishing-memory problem of RNNs, long short-term memory (LSTM) was used. The results showed that the BCI using wavelets and an RNN can drive external devices on non-training data with an accuracy of 79.6%. The experiments showed that the AdaDelta model is better than the Adam model in terms of accuracy and loss values, whereas in computational learning time, the Adam model is faster than the AdaDelta model.
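
    A minimal LSTM classifier in the spirit of this wavelet + RNN pipeline is sketched below in PyTorch. The 14-channel layout, window length, two-class (focus / MI) setup, and hidden size are assumptions; only the AdaDelta-versus-Adam comparison mirrors the abstract.

```python
import torch
import torch.nn as nn

class EEGLSTM(nn.Module):
    def __init__(self, n_features=14, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # classify from the last time step

model = EEGLSTM()
opt = torch.optim.Adadelta(model.parameters())   # swap in torch.optim.Adam to compare
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 100, 14)                # 8 windows, 100 steps, 14 channels (toy)
y = torch.randint(0, 2, (8,))
for _ in range(5):                          # a few toy optimization steps
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
print(float(loss))
```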

    Multi-modal association learning using spike-timing dependent plasticity (STDP)

    We propose an associative learning model that can integrate facial images with speech signals to target a subject in a reinforcement learning (RL) paradigm. Through this approach, the rules of learning involve associating paired stimuli (stimulus–stimulus, i.e., face–speech), also known as predictor–choice pairs. Prior to a learning simulation, we extract the features of the biometrics used in the study. For facial features, we experiment with two approaches: principal component analysis (PCA)-based Eigenfaces and singular value decomposition (SVD). For speech features, we use wavelet packet decomposition (WPD). The experiments show that the PCA-based Eigenfaces feature extraction approach produces better results than SVD. We implement the proposed learning model using the spike-timing-dependent plasticity (STDP) algorithm, which depends on the timing and rate of pre-post synaptic spikes. The key contribution of our study is the implementation of learning rules via STDP and firing rate in spatiotemporal neural networks based on the Izhikevich spiking model. We implement learning for response-group association by following reward-modulated STDP in terms of RL, wherein the firing rate of the response groups determines the reward given. We perform a number of experiments using existing face samples from the Olivetti Research Laboratory (ORL) dataset and speech samples from TIDigits. After several experiments and simulations performed to recognize a subject, the results show that the proposed learning model can associate the predictor (face) with the choice (speech) at optimum performance rates of 77.26% and 82.66% for training and testing, respectively. We also perform learning using real data: an experiment is conducted on a sample of face–speech data collected in a manner similar to that of the initial data. The performance results are 79.11% and 77.33% for training and testing, respectively. Based on these results, the proposed learning model can produce high learning performance in terms of combining heterogeneous data (face–speech). This finding opens possibilities to expand RL in the field of biometric authentication.
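
    To illustrate the kind of timing-dependent rule such a model builds on, here is a pair-based STDP update with the standard exponential window, with the reward factor gesturing at the reward-modulated variant. The time constants, amplitudes, and reward handling are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair.
    dt = t_post - t_pre in ms: positive (pre before post) -> potentiation,
    negative (post before pre) -> depression."""
    if dt >= 0:
        return a_plus * np.exp(-dt / tau_plus)
    return -a_minus * np.exp(dt / tau_minus)

pre_spikes = np.array([10.0, 30.0, 55.0])        # spike times in ms (toy)
post_spikes = np.array([12.0, 28.0, 70.0])
w, reward = 0.5, 1.0                             # reward scales updates (R-STDP)
for tp in pre_spikes:
    for tq in post_spikes:
        w += reward * stdp_dw(tq - tp)
w = float(np.clip(w, 0.0, 1.0))                  # keep the weight bounded
print(w)
```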

    Improving Quality of Life: Home Care for Chronically Ill and Elderly People

    In this chapter, we propose a system especially created for elderly or chronically ill people with special needs and poor familiarity with technology. The system combines home monitoring of physiological and emotional states through a set of wearable sensors, user-controlled (automated) home devices, and a central control for integration of the data, in order to provide a safe and friendly environment suited to the limited capabilities of the users. The main objective is the easy, low-cost automation of a room or house to provide a friendly environment that enhances the psychological condition of immobilized users. In addition, the complete interaction of the components provides an overview of the physical and emotional state of the user, building a behavior pattern that can be supervised by the caregiving staff. This approach allows the integration of physiological signals with the patient's environmental and social context to obtain a complete picture of the user's emotional states.

    An ongoing review of speech emotion recognition

    User emotional status recognition is becoming a key feature in advanced human-computer interfaces (HCI). A key source of emotional information is the spoken expression, which may be part of the interaction between the human and the machine. Speech emotion recognition (SER) is a very active area of research that involves the application of current machine learning and neural network tools. This ongoing review covers recent and classical approaches to SER reported in the literature. This work has been carried out with the support of project PID2020-116346GB-I00, funded by the Spanish MICIN.