Cross-Language Speech Emotion Recognition Using Multimodal Dual Attention Transformers
Despite the recent progress in speech emotion recognition (SER),
state-of-the-art systems are unable to achieve improved performance in
cross-language settings. In this paper, we propose a Multimodal Dual Attention
Transformer (MDAT) model to improve cross-language SER. Our model utilises
pre-trained models for multimodal feature extraction and is equipped with a
dual attention mechanism including graph attention and co-attention to capture
complex dependencies across different modalities and achieve improved
cross-language SER results using minimal target language data. In addition, our
model also exploits a transformer encoder layer for high-level feature
representation to improve emotion classification accuracy. In this way, MDAT
performs refinement of feature representation at various stages and provides
emotionally salient features to the classification layer. This novel approach
also ensures the preservation of modality-specific emotional information while
enhancing cross-modality and cross-language interactions. We assess our model's
performance on four publicly available SER datasets and establish its superior
effectiveness compared to recent approaches and baseline models.
Comment: Under Review IEEE TM
Bimodal Emotion Classification Using Deep Learning
Multimodal Emotion Recognition is an emerging field in the area of Human-Computer Interaction and Sentiment Analysis. It extracts information from each modality to predict emotions accurately. In this research, a Bimodal Emotion Recognition framework is developed with decision-level fusion of the audio and video modalities using the RAVDESS dataset. Designing such frameworks is computationally expensive, and the networks take a long time to train. Thus, a relatively small dataset was used within the scope of this research. The research is inspired by the use of neural networks for emotion classification from multimodal data. The developed framework further confirmed that merging modalities can enhance the accuracy of emotion classification. Decision-level fusion was then explored further with changes to the architecture of the unimodal networks. The research showed that the bimodal framework formed by fusing wide unimodal networks, whose layers have more nodes, outperformed the framework formed by fusing narrow unimodal networks with fewer nodes.
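Decision-level fusion, as described in the abstract above, combines the outputs of separately trained unimodal networks rather than their internal features. The abstract does not state which fusion rule was used, so the following is only a minimal sketch that assumes a weighted average of per-class probabilities; the function names, the `audio_weight` parameter, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence


def fuse_decisions(
    audio_probs: Sequence[float],
    video_probs: Sequence[float],
    audio_weight: float = 0.5,
) -> List[float]:
    """Fuse two per-class probability vectors by weighted averaging.

    Assumption: the fusion rule is a convex combination of the two
    unimodal outputs. The abstract does not specify the actual rule.
    """
    if len(audio_probs) != len(video_probs):
        raise ValueError("both modalities must score the same classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")

    fused = [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio_probs, video_probs)
    ]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalise so the fused scores still sum to 1.
    return [p / total for p in fused]


def predict(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


if __name__ == "__main__":
    # Hypothetical outputs for three emotion classes.
    audio = [0.5, 0.3, 0.2]
    video = [0.2, 0.7, 0.1]
    fused = fuse_decisions(audio, video)
    print("audio-only prediction:", predict(audio))
    print("fused prediction:", predict(fused))
```

In this made-up example the audio network alone favours class 0, while the fused scores favour class 1 because the video network is more confident. Real systems may instead use voting, learned weights, or a small classifier on top of the unimodal outputs.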
A novel efficient energy optimization in smart urban buildings based on optimal demand side management
Data availability:
The data used for this research and the preparation of this article can be accessed from the Brunel University of London repository at: https://doi.org/10.17633/rd.brunel.26049436.v1.
Increasing electrical energy consumption during peak hours leads to increased electrical energy losses and the spread of environmental pollution. For this reason, demand-side management programs have been introduced to reduce consumption during peak hours. This study proposes an efficient energy optimization in Smart Urban Buildings (SUBs) based on an Improved Sine Cosine Algorithm (ISCA) that uses the load-shifting technique for demand-side management to improve the energy consumption patterns of SUBs. The proposed system's goal is to optimize the energy use of SUB appliances in order to regulate load demand effectively, thereby reducing the peak to average ratio (PAR) and, consequently, minimizing electricity costs. This is accomplished while keeping user comfort as a priority. The proposed system is evaluated by comparing it with the Grasshopper Optimization Algorithm (GOA) and the unscheduled case. Without applying an optimization algorithm, the total electricity cost, carbon emission, PAR and waiting time are equal to 1703.576 ID, 34.16664 (kW), and 413.5864 s, respectively, for RTP. After applying GOA, the total electricity cost, carbon emission, PAR and waiting time improve to 1469.72 ID, 21.17 (kW), and 355.772 s, respectively, for RTP. Applying the ISCA improves the total electricity cost, PAR, and waiting time to 1206.748 ID, 16.5648 (kW), and 268.525384 s, respectively. In relative terms, GOA improves the total electricity cost, PAR, and waiting time by 13.72 %, 38.00 %, and 13.97 %, respectively, whereas the proposed method improves them by 29.16 %, 51.51 %, and 35.07 %, respectively.
According to the results, the developed ISCA performed better than both the unscheduled case and GOA scheduling in terms of the stated objectives and was advantageous to both utilities and consumers. Furthermore, this study presents a novel two-stage stochastic model based on the Moth-Flame Optimization Algorithm (MFOA) for the co-optimization of energy scheduling and capacity planning for energy storage systems to be incorporated into grid-connected smart urban buildings.
The research has been partially supported by the Faculty of Informatics and Management UHK excellence project “Methodological perspectives on modeling and simulation of hard and soft systems”
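The percentage improvements quoted in the abstract above can be cross-checked against the absolute figures it reports. The sketch below assumes each percentage is a relative reduction from the unscheduled case, (baseline - optimized) / baseline × 100; the abstract does not state the formula, and the dictionary keys and function names are illustrative only. The numeric values are copied from the abstract.

```python
from typing import Dict


def percent_reduction(baseline: float, optimized: float) -> float:
    """Relative reduction from baseline to optimized, in percent."""
    if baseline <= 0.0:
        raise ValueError("baseline must be positive")
    return (baseline - optimized) / baseline * 100.0


# Values as reported in the abstract (cost in ID, PAR, waiting time in s).
UNSCHEDULED = {"cost": 1703.576, "par": 34.16664, "waiting_time": 413.5864}
GOA = {"cost": 1469.72, "par": 21.17, "waiting_time": 355.772}
ISCA = {"cost": 1206.748, "par": 16.5648, "waiting_time": 268.525384}


def improvements(
    baseline: Dict[str, float], scheduled: Dict[str, float]
) -> Dict[str, float]:
    """Percent reduction of every metric relative to the baseline."""
    return {
        metric: percent_reduction(baseline[metric], scheduled[metric])
        for metric in baseline
    }


if __name__ == "__main__":
    for name, result in (("GOA", GOA), ("ISCA", ISCA)):
        for metric, value in improvements(UNSCHEDULED, result).items():
            print(f"{name} {metric}: {value:.2f} % reduction")
```

Under this assumed formula, the recomputed reductions agree with the percentages reported in the abstract to within about 0.05 percentage points, which is consistent with the reported figures having been rounded or truncated.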