86 research outputs found

    Examining the Size of the Latent Space of Convolutional Variational Autoencoders Trained With Spectral Topographic Maps of EEG Frequency Bands

    Electroencephalography (EEG) is a technique for recording brain electrical potentials using electrodes placed on the scalp [1]. It is well known that EEG signals contain essential information in the frequency, temporal, and spatial domains. For example, some studies have converted EEG signals into topographic power head maps to preserve spatial information [2]. Others have produced spectral topographic head maps of different EEG bands to both preserve information in the spatial domain and take advantage of the information in the frequency domain [3]. However, topographic maps contain highly interpolated data in between electrode locations and are often redundant. For this reason, convolutional neural networks are often used to reduce their dimensionality and learn relevant features automatically [4].
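
    As a rough illustration of the approach this abstract describes, the sketch below pairs a small convolutional encoder/decoder with a Gaussian latent space whose size can be varied. The input resolution (32×32), channel counts, and latent dimension are assumptions for the example, not values from the paper.

```python
# Minimal convolutional VAE sketch for 32x32 spectral topographic maps.
# Shapes and the latent size are illustrative assumptions, not the paper's values.
import torch
import torch.nn as nn

class ConvVAE(nn.Module):
    def __init__(self, in_channels=3, latent_dim=16):
        super().__init__()
        # Encoder: 32x32 -> 16x16 -> 8x8 -> 4x4 feature maps
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(128 * 4 * 4, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, 128 * 4 * 4)
        # Decoder mirrors the encoder with transposed convolutions
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, in_channels, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        recon = self.decoder(self.fc_dec(z).view(-1, 128, 4, 4))
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction error plus KL divergence to a standard normal prior
    rec = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld
```

    Sweeping latent_dim while tracking reconstruction loss is one simple way to examine how small the latent space can be made before relevant information is lost.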

    Active labeling in deep learning and its application to emotion prediction

    Recent breakthroughs in deep learning have made possible the learning of deep layered hierarchical representations of sensory input. Stacked restricted Boltzmann machines (RBMs), also called deep belief networks (DBNs), and stacked autoencoders are two representative deep learning methods. The key idea is greedy layer-wise unsupervised pre-training followed by supervised fine-tuning, which can be done efficiently and overcomes the difficulty of local minima when training all layers of a deep neural network at once. Deep learning has been shown to achieve outstanding performance in a number of challenging real-world applications. Existing deep learning methods involve a large number of meta-parameters, such as the number of hidden layers, the number of hidden nodes, the sparsity target, the initial values of weights, the type of units, the learning rate, etc. Existing applications usually do not explain why the decisions were made and how changes would affect performance. Thus, it is difficult for a novice user to make good decisions for a new application in order to achieve good performance. In addition, most of the existing works are done on simple and clean datasets and assume a fixed set of labeled data, which is not necessarily true for real-world applications. The main objectives of this dissertation are to investigate the optimal meta-parameters of deep learning networks as well as the effects of various data pre-processing techniques, propose a new active labeling framework for cost-effective selection of labeled data, and apply deep learning to a real-world application: emotion prediction from complex, noisy, and heterogeneous physiological sensor data. For meta-parameters and data pre-processing techniques, this study uses the benchmark MNIST digit recognition image dataset and a sleep-stage-recognition sensor dataset, and empirically compares the deep network's performance across a number of meta-parameters and design decisions: raw data vs. data pre-processed by Principal Component Analysis (PCA) with or without whitening, various structures in terms of the number of layers and the number of nodes per layer, and stacked RBMs vs. stacked autoencoders. For active labeling, a new framework for both stacked RBMs and stacked autoencoders is proposed based on three metrics: least confidence, margin sampling, and entropy. On the MNIST dataset, the proposed methods consistently outperform random labeling by a significant margin. On the other hand, the proposed active labeling methods perform similarly to random labeling on the sleep-stage-recognition dataset due to the noisiness and inconsistency in the data. For the application of deep learning to emotion prediction via physiological sensor data, a software pipeline has been developed. The system first extracts features from the raw data of four channels in an unsupervised fashion and then builds three classifiers to classify the levels of arousal, valence, and liking based on the learned features. The classification accuracy is 0.609, 0.512, and 0.684, respectively, which is comparable with existing methods based on expert-designed features.
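
    The three selection metrics the dissertation names are standard active-learning uncertainty scores over a model's softmax output. A minimal sketch, assuming class probabilities are available as a (samples × classes) array:

```python
# Sketch of the three uncertainty metrics used for active labeling,
# computed on a model's softmax outputs; numpy-based and illustrative only.
import numpy as np

def least_confidence(probs):
    # 1 minus the probability of the most likely class; higher = more uncertain
    return 1.0 - probs.max(axis=1)

def margin_sampling(probs):
    # A small gap between the top two class probabilities means high uncertainty
    part = np.sort(probs, axis=1)
    return -(part[:, -1] - part[:, -2])  # negated so larger = more uncertain

def entropy(probs, eps=1e-12):
    # Shannon entropy of the predictive distribution
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_for_labeling(probs, k, metric=entropy):
    # Return indices of the k most uncertain unlabeled samples
    return np.argsort(metric(probs))[::-1][:k]
```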

    Classifying multi-level stress responses from brain cortical EEG in Nurses and Non-health professionals using Machine Learning Auto Encoder

    Objective: Mental stress is a major problem in our society and has become an area of interest for many psychiatric researchers. One primary research focus area is the identification of bio-markers that not only identify stress but also predict the conditions (or tasks) that cause it. Electroencephalograms (EEGs) have long been used to study and identify bio-markers. While these bio-markers have successfully predicted stress in EEG studies for binary conditions, their performance is suboptimal for multiple conditions of stress.
    Methods: To overcome this challenge, we propose using latent representations of the bio-markers, which have been shown to significantly improve EEG classification performance compared to traditional bio-markers alone. We evaluated three commonly used EEG-based bio-markers for stress, the brain load index (BLI), the spectral power values of the EEG frequency bands (alpha, beta, and theta), and the relative gamma (RG), together with their respective latent representations, using four commonly used classifiers.
    Results: The spectral-power-based bio-markers performed well, with an accuracy of 83%, while their latent representations reached an accuracy of 91%.
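
    For context, spectral power bio-markers of this kind can be computed per channel with Welch's method. The band limits, sampling rate, and the relative-gamma definition below are common conventions assumed for illustration, not necessarily the authors' exact choices.

```python
# Illustrative extraction of band-power bio-markers from one EEG epoch using
# Welch's method; band limits and sampling rate are common defaults, assumed here.
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_powers(epoch, fs=256):
    # epoch: (n_channels, n_samples) EEG segment
    freqs, psd = welch(epoch, fs=fs, nperseg=fs * 2, axis=-1)
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = psd[:, mask].mean(axis=-1)  # mean power per channel
    return powers

def relative_gamma(powers):
    # One common definition (an assumption here): gamma relative to slower bands
    slow = powers["theta"] + powers["alpha"] + powers["beta"]
    return powers["gamma"] / slow
```

    Features like these can then be fed to an autoencoder to obtain the latent representations the abstract compares against the raw bio-markers.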

    Investigating the use of pretrained convolutional neural network on cross-subject and cross-dataset EEG emotion recognition

    The electroencephalogram (EEG) is highly attractive in emotion recognition studies due to its resistance to deceptive actions of humans. This is one of the most significant advantages of brain signals over visual or speech signals in the emotion recognition context. A major challenge in EEG-based emotion recognition is that EEG recordings exhibit varying distributions for different people, as well as for the same person at different time instances. This nonstationary nature of EEG limits its accuracy when subject independence is the priority. The aim of this study is to increase subject-independent recognition accuracy by exploiting pretrained state-of-the-art Convolutional Neural Network (CNN) architectures. Unlike similar studies that extract spectral band power features from the EEG readings, raw EEG data is used in our study after applying windowing, pre-adjustments, and normalization. Removing manual feature extraction from the training system avoids the risk of eliminating hidden features in the raw data and helps leverage the deep neural network's power in uncovering unknown features. To further improve classification accuracy, a median filter is used to eliminate false detections along a prediction interval of emotions. This method yields a mean cross-subject accuracy of 86.56% and 78.34% on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED) for two and three emotion classes, respectively. It also yields a mean cross-subject accuracy of 72.81% on the Database for Emotion Analysis using Physiological Signals (DEAP) and 81.8% on the Loughborough University Multimodal Emotion Dataset (LUMED) for two emotion classes. Furthermore, the recognition model trained on the SEED dataset was tested with the DEAP dataset, yielding a mean prediction accuracy of 58.1% across all subjects and emotion classes. The results show that, in terms of classification accuracy, the proposed approach is superior to, or on par with, the reference subject-independent EEG emotion recognition studies identified in the literature, and has limited complexity due to the elimination of the need for feature extraction.
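
    The median-filter post-processing step the abstract mentions can be sketched as follows. The window length is an assumed hyperparameter, and treating class labels as numbers for a median is a simplification that works when false detections are isolated.

```python
# Sketch of smoothing frame-level emotion predictions with a median filter,
# as the abstract describes; the window length is an assumed hyperparameter.
import numpy as np
from scipy.signal import medfilt

def smooth_predictions(labels, kernel_size=9):
    # labels: 1-D array of per-window class predictions (integers)
    # An odd-sized kernel removes isolated false detections along the interval
    return medfilt(labels.astype(float), kernel_size=kernel_size).astype(int)

preds = np.array([0, 0, 1, 0, 0, 0, 2, 2, 2, 2, 0, 2, 2])
print(smooth_predictions(preds, kernel_size=3))  # isolated spikes suppressed
```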

    Emotion Recognition with Pre-Trained Transformers Using Multimodal Signals

    In this paper, we address the problem of multimodal emotion recognition from multiple physiological signals. We demonstrate that a Transformer-based approach is suitable for this task. In addition, we present how such models may be pretrained in a multimodal scenario to improve emotion recognition performance. We evaluate the benefits of using multimodal inputs and pretraining with our approach on a state-of-the-art dataset.
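
    A minimal sketch of the kind of Transformer-based model the paper describes, using PyTorch's stock encoder; the channel count, window length, and model size are illustrative assumptions, and the multimodal pretraining stage is omitted.

```python
# Minimal sketch of a Transformer encoder over windowed multimodal physiological
# signals; channel counts, window length, and model size are assumptions.
import torch
import torch.nn as nn

class MultimodalTransformer(nn.Module):
    def __init__(self, n_channels=6, d_model=64, n_heads=4, n_layers=2, n_classes=2):
        super().__init__()
        self.proj = nn.Linear(n_channels, d_model)  # per-timestep embedding
        self.pos = nn.Parameter(torch.randn(1, 512, d_model) * 0.02)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):
        # x: (batch, time, channels), with time <= 512
        h = self.proj(x) + self.pos[:, : x.size(1)]
        h = self.encoder(h)
        return self.head(h.mean(dim=1))  # pool over time, then classify
```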

    EEGFuseNet: Hybrid Unsupervised Deep Feature Characterization and Fusion for High-Dimensional EEG With an Application to Emotion Recognition

    How to effectively and efficiently extract valid and reliable features from high-dimensional electroencephalography (EEG), and particularly how to fuse spatial and temporal dynamic brain information into a better feature representation, is a critical issue in brain data analysis. Most current EEG studies work in a task-driven manner and explore valid EEG features with a supervised model, which is limited by the given labels to a great extent. In this paper, we propose a practical hybrid unsupervised deep convolutional recurrent generative adversarial network for EEG feature characterization and fusion, termed EEGFuseNet. EEGFuseNet is trained in an unsupervised manner, and deep EEG features covering both spatial and temporal dynamics are characterized automatically. Compared with existing features, the characterized deep EEG features can be considered more generic and independent of any specific EEG task. The performance of the deep, low-dimensional features extracted by EEGFuseNet is carefully evaluated in an unsupervised emotion recognition application based on three public emotion databases. The results demonstrate that the proposed EEGFuseNet is a robust and reliable model that is easy to train and performs efficiently in the representation and fusion of dynamic EEG features. In particular, EEGFuseNet is established as an optimal unsupervised fusion model with promising cross-subject emotion recognition performance. This shows that EEGFuseNet is capable of characterizing and fusing deep features that reflect cortical dynamics corresponding to changes between different emotion states, and also demonstrates the possibility of realizing EEG-based cross-subject emotion recognition in a purely unsupervised manner.
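
    The published EEGFuseNet combines convolutional, recurrent, and adversarial components; the hedged sketch below captures only the spatial-then-temporal fusion idea with a convolutional-recurrent autoencoder and omits the GAN training, so it is an approximation, not the authors' architecture.

```python
# A simplified convolutional-recurrent autoencoder in the spirit of EEGFuseNet's
# spatial + temporal fusion; a hedged sketch, not the published architecture.
import torch
import torch.nn as nn

class ConvGRUAutoencoder(nn.Module):
    def __init__(self, n_channels=32, feat_dim=64, hidden=64):
        super().__init__()
        # Spatial encoder: mixes electrodes at each time step with 1-D convolutions
        self.spatial = nn.Sequential(
            nn.Conv1d(n_channels, feat_dim, kernel_size=7, padding=3), nn.ELU(),
            nn.Conv1d(feat_dim, feat_dim, kernel_size=7, padding=3), nn.ELU(),
        )
        # Temporal encoder/decoder: GRUs capture dynamics over the window
        self.enc_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.dec_rnn = nn.GRU(hidden, feat_dim, batch_first=True)
        self.out = nn.Conv1d(feat_dim, n_channels, kernel_size=1)

    def forward(self, x):
        # x: (batch, channels, time)
        h = self.spatial(x).transpose(1, 2)   # (batch, time, feat)
        z, _ = self.enc_rnn(h)                # fused spatio-temporal features
        d, _ = self.dec_rnn(z)
        recon = self.out(d.transpose(1, 2))   # back to (batch, channels, time)
        return recon, z.mean(dim=1)           # reconstruction + pooled feature
```

    Training such a model with a reconstruction loss alone already yields label-free features; the adversarial objective in the paper additionally sharpens the reconstructions.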