
    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired techniques such as swarm intelligence and evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the use of CI for the optimal solving of various applications, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies yields a strong and reliable prediction tool for handling real-life applications.

    Perceptual borderline for balancing multi-class spontaneous emotional data

    Speech is a behavioural biometric signal that can provide important information for understanding human intent as well as emotional status. This paper is centered on the speech-based identification of seniors' emotional status during their interaction with a virtual agent playing the role of a health professional coach. Under real conditions, only a small set of task-dependent spontaneous emotions can be identified. The number of identified samples differs largely across emotions, which results in an imbalanced dataset problem. This research proposes the dimensional model of emotions as a perceptual representation space, an alternative to the generally used acoustic one. The main contribution of the paper is the definition of a perceptual borderline for the oversampling of minority emotion classes in this space. This limit, based on arousal and valence criteria, leads to two methods of balancing the data: Perceptual Borderline oversampling and Perceptual Borderline SMOTE (Synthetic Minority Oversampling Technique). Both methods are implemented and compared to the state-of-the-art approaches of Random oversampling and SMOTE. The experimental evaluation was carried out on three imbalanced datasets of spontaneous emotions acquired in human-machine scenarios in three different cultures: Spain, France and Norway. The emotion recognition results obtained by neural network classifiers show that the proposed perceptual oversampling methods led to significant improvements over the state of the art for all scenarios and languages. The research presented in this paper was conducted as part of the EMPATHIC project and the MENHIR MSCA action, which have received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 769872 and No 823907, respectively.
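    The core idea, oversampling a minority emotion class near the class borderline in a 2-D arousal-valence space, can be illustrated with a short sketch. This is a minimal SMOTE-style interpolation under an assumed borderline criterion (proximity to the majority-class centroid), which is a hypothetical stand-in for the paper's perceptual criteria; the function name and class positions are illustrative only.

```python
import numpy as np

def perceptual_borderline_smote(minority, majority, n_new, k=5, seed=None):
    """Generate n_new synthetic minority samples near the class borderline.

    minority, majority: arrays of shape (n, 2) holding (arousal, valence).
    Hypothetical criterion: minority samples closest to the majority-class
    centroid are treated as "borderline" seeds for SMOTE interpolation.
    """
    rng = np.random.default_rng(seed)
    centroid = majority.mean(axis=0)
    dists = np.linalg.norm(minority - centroid, axis=1)
    seeds = minority[np.argsort(dists)[: max(k, len(minority) // 2)]]

    synthetic = []
    for _ in range(n_new):
        a = seeds[rng.integers(len(seeds))]
        # SMOTE-style step: interpolate toward a random k-nearest neighbour
        # among the borderline seeds (index 0 is `a` itself, so skip it).
        nn = seeds[np.argsort(np.linalg.norm(seeds - a, axis=1))[1 : k + 1]]
        b = nn[rng.integers(len(nn))]
        synthetic.append(a + rng.random() * (b - a))
    return np.array(synthetic)

# Illustrative usage with synthetic arousal-valence clusters.
rng = np.random.default_rng(0)
majority = rng.normal([0.6, 0.5], 0.1, size=(200, 2))  # e.g. a frequent emotion
minority = rng.normal([0.2, 0.8], 0.1, size=(20, 2))   # e.g. a rare emotion
new = perceptual_borderline_smote(minority, majority, n_new=180, seed=1)
print(new.shape)  # (180, 2): the class is balanced after concatenation
```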

    Real-time head movement tracking through earables in moving vehicles

    The Internet of Things is enabling innovations in the automotive industry by expanding the capabilities of vehicles through connecting them with the cloud. One important application domain is traffic safety, which can benefit from monitoring the driver's condition to determine whether they are capable of safely handling the vehicle. By detecting drowsiness, inattentiveness, and distraction of the driver, it is possible to react before accidents happen. This thesis explores how accelerometer and gyroscope data collected using earables can be used to classify the orientation of the driver's head in a moving vehicle. It is found that machine learning algorithms such as Random Forest and K-Nearest Neighbor can reach fairly accurate classifications even without applying any noise reduction to the signal data. Data cleaning and transformation approaches are studied to see how the models could be improved further. This study paves the way for the development of driver monitoring systems capable of reacting to anomalous driving behaviour before traffic accidents can happen.
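    A minimal sketch of the classification setup the thesis describes: windowed 6-axis IMU signals summarised into simple statistical features and fed to a Random Forest. The column names, label set, window size, and the randomly generated stand-in data are assumptions for illustration, not the thesis's actual pipeline or dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

COLS = ["acc_x", "acc_y", "acc_z", "gyr_x", "gyr_y", "gyr_z"]

def window_features(df, window=50):
    """Mean/std summary features over fixed-size windows of 6-axis IMU data."""
    feats, labels = [], []
    for start in range(0, len(df) - window + 1, window):
        w = df.iloc[start : start + window]
        feats.append(np.r_[w[COLS].mean().values, w[COLS].std().values])
        labels.append(w["head_pose"].mode()[0])  # majority label per window
    return np.array(feats), np.array(labels)

# Stand-in data, only to make the pipeline runnable end to end; a real study
# would use labelled earable recordings from a moving vehicle.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(5000, 6)), columns=COLS)
df["head_pose"] = rng.choice(["forward", "left", "right", "down"], size=len(df))

X, y = window_features(df)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```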

    Deep Sensing: Inertial and Ambient Sensing for Activity Context Recognition using Deep Convolutional Neural Networks

    With the widespread use of the embedded sensing capabilities of mobile devices, there has been unprecedented development of context-aware solutions. This allows the proliferation of various intelligent applications, such as those for remote health and lifestyle monitoring, intelligent personalized services, etc. However, activity context recognition based on multivariate time series signals obtained from mobile devices in unconstrained conditions is naturally prone to class imbalance problems. This means that recognition models tend to predict the classes with the most samples while ignoring the classes with the fewest, resulting in poor generalization. To address this problem, we propose augmenting the time series signals from inertial sensors with signals from ambient sensing to train deep convolutional neural network (DCNN) models. DCNNs capture the local dependency and scale invariance of these combined sensor signals. Consequently, we developed a DCNN model using only inertial sensor signals, and then another model that combined signals from both inertial and ambient sensors, in order to investigate the class imbalance problem by improving the performance of the recognition model. Evaluation and analysis of the proposed system using data with imbalanced classes show that it achieved better recognition accuracy when data from inertial sensors were combined with those from ambient sensors, such as environmental noise level and illumination, with an overall improvement of 5.3% in accuracy.
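    The fusion idea can be sketched as stacking inertial channels (accelerometer, gyroscope) with ambient channels (noise level, illumination) into one multivariate window for a small 1-D CNN. Channel counts, layer sizes, and the class count below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FusionDCNN(nn.Module):
    def __init__(self, n_inertial=6, n_ambient=2, n_classes=6):
        super().__init__()
        in_ch = n_inertial + n_ambient  # fused input channels
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),    # pools away the time axis
            nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, inertial, ambient):
        # inertial: (batch, 6, T); ambient: (batch, 2, T), resampled to the
        # same window length T before channel-wise fusion.
        return self.net(torch.cat([inertial, ambient], dim=1))

model = FusionDCNN()
logits = model(torch.randn(8, 6, 128), torch.randn(8, 2, 128))
print(logits.shape)  # torch.Size([8, 6])
```

    Dropping the ambient input (or zeroing those channels) gives the inertial-only baseline, so the two models the paper compares differ only in the fused channel count.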

    HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition

    Emotion recognition in conversations is challenging due to the multi-modal nature of emotion expression. We propose a hierarchical cross-attention model (HCAM) approach to multi-modal emotion recognition using a combination of recurrent and co-attention neural network models. The input to the model consists of two modalities: i) audio data, processed through a learnable wav2vec approach, and ii) text data, represented using a bidirectional encoder representations from transformers (BERT) model. The audio and text representations are processed by a set of bi-directional recurrent neural network layers with self-attention that convert each utterance in a given conversation into a fixed-dimensional embedding. To incorporate contextual knowledge and the information across the two modalities, the audio and text embeddings are combined using a co-attention layer that weighs the utterance-level embeddings relevant to the task of emotion recognition. The neural network parameters in the audio layers, text layers, and multi-modal co-attention layers are hierarchically trained for the emotion classification task. We perform experiments on three established datasets, namely IEMOCAP, MELD and CMU-MOSI, where we show that the proposed model improves significantly over other benchmarks and achieves state-of-the-art results on all these datasets.
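    A minimal sketch of the co-attention fusion step, assuming precomputed utterance-level audio (wav2vec) and text (BERT) embeddings per conversation. The dimensions, the use of `nn.MultiheadAttention` as the co-attention mechanism, and the class count are illustrative stand-ins for the paper's actual layer.

```python
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    def __init__(self, dim=256, n_heads=4, n_classes=4):
        super().__init__()
        # Each modality attends over the other's utterance sequence.
        self.audio_to_text = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.text_to_audio = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def forward(self, audio_emb, text_emb):
        # audio_emb, text_emb: (batch, n_utterances, dim), one embedding
        # per utterance in the conversation.
        a, _ = self.audio_to_text(audio_emb, text_emb, text_emb)
        t, _ = self.text_to_audio(text_emb, audio_emb, audio_emb)
        fused = torch.cat([a, t], dim=-1)  # per-utterance cross-modal fusion
        return self.classifier(fused)      # emotion logits per utterance

fusion = CoAttentionFusion()
out = fusion(torch.randn(2, 10, 256), torch.randn(2, 10, 256))
print(out.shape)  # torch.Size([2, 10, 4])
```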