Search CORE

3,581 research outputs found

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

Author: Cummins Nicholas
Han Jing
Schuller Björn
Zhang Zixing
Publication venue
Publication date: 21/09/2018
Field of study

Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. As a potentially crucial technique for the development of the next generation of emotional AI systems, we herein provide a comprehensive overview of the application of adversarial training to affective computing and sentiment analysis. Various representative adversarial training algorithms are explained and discussed accordingly, aimed at tackling diverse challenges associated with emotional AI systems. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities

arXiv.org e-Print Archive

OPUS Augsburg

Substructure and Boundary Modeling for Continuous Action Recognition

Author: Huang Thomas
Lin Kai-Hsiang
Wang Jinjun
Wang Zhaowen
Xiao Jing
Publication venue
Publication date: 01/01/2012
Field of study

This paper introduces a probabilistic graphical model for continuous action recognition with two novel components: substructure transition model and discriminative boundary model. The first component encodes the sparse and global temporal transition prior between action primitives in state-space model to handle the large spatial-temporal variations within an action class. The second component enforces the action duration constraint in a discriminative way to locate the transition boundaries between actions more accurately. The two components are integrated into a unified graphical structure to enable effective training and inference. Our comprehensive experimental results on both public and in-house datasets show that, with the capability to incorporate additional information that had not been explicitly or efficiently modeled by previous methods, our proposed algorithm achieved significantly improved performance for continuous action recognition.Comment: Detailed version of the CVPR 2012 paper. 15 pages, 6 figure

arXiv.org e-Print Archive

CiteSeerX

A prototypical network approach for evaluating generated emotional speech

Author: André Elisabeth
Baird Alice
Mertes Silvan
Milling Manuel
Schuller Björn W.
Stappen Lukas
Wiest Thomas
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2021
Field of study

OPUS Augsburg

An improved StarGAN for emotional voice conversion: enhancing voice quality and data augmentation

Author: Chen Junjie
He Xiangheng
Rizos Georgios
Schuller Björn W.
Publication venue: 'International Speech Communication Association'
Publication date: 18/07/2021
Field of study

Emotional Voice Conversion (EVC) aims to convert the emotional style of a source speech signal to a target style while preserving its content and speaker identity information. Previous emotional conversion studies do not disentangle emotional information from emotion-independent information that should be preserved, thus transforming it all in a monolithic manner and generating audio of low quality, with linguistic distortions. To address this distortion problem, we propose a novel StarGAN framework along with a two-stage training process that separates emotional features from those independent of emotion by using an autoencoder with two encoders as the generator of the Generative Adversarial Network (GAN). The proposed model achieves favourable results in both the objective evaluation and the subjective evaluation in terms of distortion, which reveals that the proposed model can effectively reduce distortion. Furthermore, in data augmentation experiments for end-to-end speech emotion recognition, the proposed StarGAN model achieves an increase of 2% in Micro-F1 and 5% in Macro-F1 compared to the baseline StarGAN model, which indicates that the proposed model is more valuable for data augmentation.Comment: Accepted by Interspeech 202

arXiv.org e-Print Archive

OPUS Augsburg