10 research outputs found

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

Authors: Lian Hailun, Lu Cheng, Schuller Björn, Zhao Yan, Zheng Wenming, Zong Yuan
Publication date: 18/01/2024

In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers. Consequently, when the trained model is confronted with data from new speakers, its performance tends to degrade. To address this issue, we propose a Dynamic Joint Distribution Adaptation (DJDA) method under the framework of multi-source domain adaptation. DJDA first uses joint distribution adaptation (JDA), comprising marginal distribution adaptation (MDA) and conditional distribution adaptation (CDA), to measure more precisely the multi-domain distribution shifts caused by different speakers. This helps eliminate speaker bias in emotion features, allowing discriminative and speaker-invariant speech emotion features to be learned from coarse level to fine level. Furthermore, we quantify the adaptation contributions of MDA and CDA within JDA using a dynamic balance factor based on the $\mathcal{A}$-distance, which helps handle the unknown distributions encountered in data from new speakers. Experimental results demonstrate the superior performance of DJDA compared with other state-of-the-art (SOTA) methods.

Comment: Accepted by ICASSP 202

arXiv.org e-Print Archive
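
The DJDA abstract above mentions a dynamic balance factor based on the $\mathcal{A}$-distance but does not give its formula. The sketch below is only an illustration under stated assumptions: it uses the standard proxy $\mathcal{A}$-distance, 2(1 - 2ε), where ε is the error of a classifier trained to tell two domains apart, and an assumed ratio heuristic for the balance factor. The function names and the weighting rule are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def proxy_a_distance(domain_clf_error: float) -> float:
    """Proxy A-distance 2 * (1 - 2 * error) for a domain classifier's error rate."""
    if not 0.0 <= domain_clf_error <= 1.0:
        raise ValueError("error must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * domain_clf_error)


def balance_factor(marginal_dist: float, conditional_dists: Sequence[float]) -> float:
    """Assumed heuristic: weight mu in [0, 1] placed on conditional adaptation (CDA).

    A large marginal distance pushes mu toward 0 (favor MDA); large per-class
    conditional distances push mu toward 1 (favor CDA).
    """
    d_m = max(marginal_dist, 0.0)  # clip negative proxy distances to zero
    d_c = sum(max(d, 0.0) for d in conditional_dists)
    if d_m + d_c == 0.0:
        return 0.5  # no measurable shift: weight both terms equally
    return 1.0 - d_m / (d_m + d_c)


def joint_adaptation_loss(mda_loss: float, cda_loss: float, mu: float) -> float:
    """Combine the two adaptation terms with the balance factor mu."""
    return (1.0 - mu) * mda_loss + mu * cda_loss


if __name__ == "__main__":
    # Hypothetical domain-classifier errors: one marginal, two per-emotion-class.
    d_m = proxy_a_distance(0.10)
    d_c = [proxy_a_distance(e) for e in (0.30, 0.40)]
    mu = balance_factor(d_m, d_c)
    print("mu =", mu, "loss =", joint_adaptation_loss(1.0, 2.0, mu))
```

In this sketch the factor would be recomputed periodically during training, so the weighting follows the shift that is currently largest.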

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Authors: Li Xiang, Ou Dan, Tan Jiwei, Wang Chao, Zeng Xiaoyi, Zheng Bo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 07/03/2020

For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although many CTR prediction models have been proposed, learning good representations of items from multimodal features remains under-investigated, even though an item in E-commerce usually contains multiple heterogeneous modalities. Previous works either concatenate the features of the multiple modalities, which is equivalent to giving a fixed importance weight to each modality, or learn dynamic weights of different modalities for different items through techniques such as attention mechanisms. However, common redundant information usually exists across multiple modalities, and dynamic weights computed from this redundant information may not correctly reflect the importance of each modality. To address this, we explore the complementarity and redundancy of modalities by treating modality-specific and modality-invariant features differently. We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task. A multimodal attention network first calculates the weights of multiple modalities for each item according to its modality-specific features. Then a multimodal adversarial network learns modality-invariant representations, where a double-discriminator strategy is introduced. Finally, we obtain the multimodal item representations by combining both modality-specific and modality-invariant representations. We conduct extensive experiments on both public and industrial datasets, and the proposed method consistently achieves remarkable improvements over state-of-the-art methods. Moreover, the approach has been deployed in an operational E-commerce system, and online A/B testing further demonstrates its effectiveness.

Comment: Accepted to WWW 2020, 10 page

arXiv.org e-Print Archive

Crossref
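
The MARN abstract above contrasts fixed modality weights (plain concatenation) with per-item dynamic weights from an attention mechanism. The following is a minimal sketch of the dynamic-weight idea only, assuming that each modality already has a feature vector of equal length and a scalar relevance score. The scores, names, and the weighted-sum fusion are hypothetical; in MARN the weights come from a learned attention network over modality-specific features, which is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(
    features: Sequence[Sequence[float]], scores: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (weights, fused vector) as an attention-weighted sum of modalities."""
    if len(features) != len(scores):
        raise ValueError("need exactly one score per modality")
    weights = softmax(scores)
    dim = len(features[0])
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, features)) for i in range(dim)
    ]
    return weights, fused


if __name__ == "__main__":
    # Hypothetical item with an image vector and a title-text vector.
    image_vec = [1.0, 0.0, 2.0]
    text_vec = [0.0, 4.0, 2.0]
    weights, fused = fuse_modalities([image_vec, text_vec], scores=[2.0, 0.5])
    print("weights:", weights)
    print("fused:", fused)
```

Because the scores differ per item, two items can weight the same modality differently, which a fixed concatenation cannot express.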

Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources

Authors: Guo Jiang, Keutzer Kurt, Krishna Ravi, Xiao Yang, Xu Pengfei, Yang Jufeng, Yue Xiangyu, Zhao Sicheng
Publication date: 17/02/2021

Sentiment analysis of user-generated reviews or comments on products and services in social networks can help enterprises analyze customer feedback and take corresponding actions for improvement. To reduce the need for large-scale annotation on the target domain, domain adaptation (DA) provides an alternative solution by learning a transferable model from other labeled source domains. Existing multi-source domain adaptation (MDA) methods either fail to extract some discriminative features in the target domain that are related to sentiment, neglect the correlations of different sources and the distribution differences among sub-domains even within the same source, or cannot reflect the varying optimal weighting during different training stages. In this paper, we propose a novel instance-level MDA framework, named curriculum cycle-consistent generative adversarial network (C-CycleGAN), to address the above issues. Specifically, C-CycleGAN consists of three components: (1) a pre-trained text encoder, which encodes textual input from different domains into a continuous representation space, (2) an intermediate domain generator with curriculum instance-level adaptation, which bridges the gap across source and target domains, and (3) a task classifier trained on the intermediate domain for final sentiment classification. C-CycleGAN transfers source samples at the instance level to an intermediate domain that is closer to the target domain, with sentiment semantics preserved and without losing discriminative features. Further, our dynamic instance-level weighting mechanisms can assign the optimal weights to different source samples in each training stage. We conduct extensive experiments on three benchmark datasets and achieve substantial gains over state-of-the-art DA approaches. Our source code is released at: https://github.com/WArushrush/Curriculum-CycleGAN

Comment: Accepted by WWW 202

arXiv.org e-Print Archive

DSpace@MIT

Fault Diagnosis of Transfer Learning Equipment Based on Cloud Edge Collaboration + Confrontation Network

Authors: Jiang Lei, Zhang Zhenji, Zou Ping
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

With the continuous improvement of product quality, production efficiency, and complexity, higher requirements are placed on the reliability and stability of equipment, and real-time diagnosis of faults and functional failures is becoming more difficult. Traditional fault diagnosis methods based on signal processing and convolutional neural networks cannot meet the requirements of on-site, online, real-time fault diagnosis of equipment. First, the vibration signals at industrial sites are superimposed on one another, nonlinear, and non-stationary, and traditional feature extraction methods take a long time and produce unstable results. Second, massive data and fault diagnosis algorithms need rich computing and storage resources, so the traditional convolutional neural network approach conflicts with the real-time response requirements of fault diagnosis. At the same time, fault diagnosis models generalize poorly across different equipment models, so diagnostic accuracy is low or diagnosis is impossible. To solve these problems, this paper proposes a fault diagnosis method based on an industrial Internet platform that combines equipment cloud-edge collaboration with adaptive adversarial-network transfer learning. On the edge side, the vibration signals collected from key components of the equipment are processed using ensemble empirical mode decomposition (EEMD) to address signal nonlinearity and non-stationarity. In the cloud, the EEMD-decomposed signals of different equipment models are divided into a source domain and a target domain for adversarial training and used as the input of an improved domain adversarial network model, DANN (Domain Adversarial Neural Networks), so that cloud computing power and the improved adversarial transfer learning algorithm raise the accuracy of fault diagnosis across equipment models.
Through analysis of experimental data, this paper verifies that the model obtained with adversarial-network transfer learning is more accurate than traditional fault diagnosis methods. By coordinating computing resources with real-time requirements, real-time cloud-edge collaborative diagnosis of bearing faults is realized.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia
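
The fault-diagnosis abstract above builds on DANN but does not describe its improved variant. As background only, the sketch below shows two ingredients of the original DANN formulation: the gradient reversal step, which passes features through unchanged in the forward pass and flips the sign of the gradient in the backward pass, and the commonly used ramp-up schedule for the reversal strength, 2 / (1 + exp(-γp)) - 1 with γ = 10 and training progress p in [0, 1]. Whether the paper's improved DANN keeps this schedule is an assumption, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def dann_lambda(progress: float, gamma: float = 10.0) -> float:
    """Reversal strength that ramps from 0 toward 1 as training progresses."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def grad_reverse_forward(features: Sequence[float]) -> List[float]:
    """Forward pass of gradient reversal: the identity function."""
    return list(features)


def grad_reverse_backward(upstream_grad: Sequence[float], lam: float) -> List[float]:
    """Backward pass: negate and scale the domain classifier's gradient.

    The feature extractor therefore moves in the direction that makes the
    source and target domains harder to tell apart.
    """
    return [-lam * g for g in upstream_grad]


if __name__ == "__main__":
    for step, total in ((0, 100), (25, 100), (100, 100)):
        lam = dann_lambda(step / total)
        print(step, lam, grad_reverse_backward([2.0, -4.0], lam))
```

Early in training the reversal strength is close to zero, so the noisy domain classifier has little influence on the feature extractor.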

Domain Adaptation for Autonomous Driving

Author: Chen Xingxin
Publication venue: University of Waterloo
Publication date: 10/10/2020

Metric-based methods are a promising approach to domain adaptation, which aim to align the marginal distributions of different domains that share a similar conditional distribution. The thesis explores applying domain adaptation to autonomous driving and proposes domain adaptation methods for 2D image semantic segmentation and 3D point cloud object detection. For 2D image semantic segmentation, traditional approaches manually design a metric function to measure the distance between domains. Adversarial methods can be viewed as learning the metric function automatically. Rather than depending on the quality of the metric function, this thesis outlines a generalized framework for domain randomization that first introduces moderate perturbation as randomness and then combines the advantages of metric-based domain adaptation and domain randomization. A simple-to-implement training pipeline for this framework is then proposed, showing that the proposed model achieves performance comparable to metric-based methods while generalizing better. The proposed Metric Guided Domain Randomization approach improves mean intersection-over-union on the target domain from 16.9 to 27.2 without using any target domain data or annotations. 3D point cloud domain adaptation has received little research attention. A method based on global feature alignment is proposed, and experiments show that it outperforms direct fine-tuning on target domain data when only a limited number of target data frames are available.

University of Waterloo's Institutional Repository
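
The thesis abstract above reports its segmentation result as mean intersection-over-union (mIoU). For readers unfamiliar with the metric, here is a minimal sketch of how mIoU is usually computed from a class confusion matrix. The matrix values are made up for illustration, and skipping classes that never occur is an assumed convention, since the thesis's exact evaluation protocol is not given in the abstract.

```python
from typing import List, Sequence


def per_class_iou(confusion: Sequence[Sequence[int]]) -> List[float]:
    """IoU per class from a square confusion matrix (rows: truth, cols: prediction)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        true_pos = confusion[c][c]
        false_pos = sum(confusion[r][c] for r in range(n)) - true_pos
        false_neg = sum(confusion[c]) - true_pos
        union = true_pos + false_pos + false_neg
        if union > 0:  # assumed convention: skip classes absent from both maps
            ious.append(true_pos / union)
    return ious


def mean_iou(confusion: Sequence[Sequence[int]]) -> float:
    """Average the per-class IoU values; returns 0.0 if no class is present."""
    ious = per_class_iou(confusion)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Hypothetical pixel counts for two classes (for example road and vehicle).
    confusion = [
        [3, 1],
        [1, 3],
    ]
    print("per-class IoU:", per_class_iou(confusion))
    print("mIoU:", mean_iou(confusion))
```

If the reported values are percentages, the change from 16.9 to 27.2 corresponds to an absolute gain of 10.3 points.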