1,859 research outputs found

    Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

    Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. Since adversarial training is a potentially crucial technique for the development of the next generation of emotional AI systems, we herein provide a comprehensive overview of its application to affective computing and sentiment analysis. Various representative adversarial training algorithms are explained and discussed, each aimed at tackling a different challenge associated with emotional AI systems. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.
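    As a concrete flavour of the algorithms surveyed, the sketch below applies a simple FGSM-style adversarial perturbation at the embedding level of a toy sentiment classifier; the network, dimensions, and epsilon are illustrative assumptions of ours, not a method prescribed by this overview.

        # Minimal sketch (PyTorch): adversarial training on word embeddings for
        # a toy sentiment classifier. All names and hyperparameters are
        # illustrative assumptions, not taken from the surveyed papers.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SentimentNet(nn.Module):
            def __init__(self, vocab_size=10000, emb_dim=128, num_classes=2):
                super().__init__()
                self.emb = nn.Embedding(vocab_size, emb_dim)
                self.clf = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(),
                                         nn.Linear(64, num_classes))

            def forward_from_emb(self, e):
                # Mean-pool token embeddings, then classify.
                return self.clf(e.mean(dim=1))

        def adversarial_step(model, tokens, labels, optimizer, epsilon=1.0):
            """One update on the clean loss plus the loss on perturbed embeddings."""
            emb = model.emb(tokens)
            clean_loss = F.cross_entropy(model.forward_from_emb(emb), labels)
            # Gradient of the loss w.r.t. the embeddings gives the attack direction.
            grad_emb, = torch.autograd.grad(clean_loss, emb, retain_graph=True)
            adv_emb = emb + epsilon * grad_emb.sign()
            adv_loss = F.cross_entropy(model.forward_from_emb(adv_emb), labels)
            optimizer.zero_grad()
            (clean_loss + adv_loss).backward()
            optimizer.step()
            return clean_loss.item(), adv_loss.item()

    In a full training loop this step would run once per minibatch, trading off the clean and adversarial loss terms; GAN-style adversarial training instead follows a two-player generator/discriminator formulation.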

    A review of domain adaptation without target labels

    Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain, while inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. Comment: 20 pages, 5 figures
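    For the sample-based family, one common instantiation is importance weighting with a discriminatively estimated density ratio; the sketch below assumes only unlabeled target-domain inputs are available, and the estimator shown is one illustrative choice rather than the review's prescribed method.

        # Minimal sketch (scikit-learn): importance-weighted source training.
        # Names and the choice of logistic regression are illustrative assumptions.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def importance_weights(X_source, X_target):
            """Estimate w(x) ~ p_target(x) / p_source(x) via a domain classifier."""
            X = np.vstack([X_source, X_target])
            d = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
            domain_clf = LogisticRegression(max_iter=1000).fit(X, d)
            p_t = domain_clf.predict_proba(X_source)[:, 1]
            # p(target|x) / p(source|x) is proportional to the density ratio.
            return p_t / np.clip(1.0 - p_t, 1e-6, None)

        def fit_weighted_source_classifier(X_source, y_source, X_target):
            w = importance_weights(X_source, X_target)
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X_source, y_source, sample_weight=w)  # reweighted source risk
            return clf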

    A Bayesian predictive classification approach to robust speech recognition

    We introduce a new Bayesian predictive classification (BPC) approach to robust speech recognition and apply the BPC framework to speech recognition based on Gaussian mixture continuous density hidden Markov models. We propose and focus on one of the approximate BPC approaches, called quasi-Bayesian predictive classification (QBPC). When the QBPC method is applied to speaker-independent recognition of a confusable vocabulary, namely the 26 English letters, where a broad range of mismatches between training and testing conditions exists, it achieves around 14% relative recognition error rate reduction compared with standard plug-in maximum a posteriori decoding. When the QBPC method is applied to cross-gender testing on a less confusable vocabulary, namely 20 English digits and commands, it achieves around 24% relative recognition error rate reduction.
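    Schematically, plug-in maximum a posteriori decoding scores each hypothesis with point-estimated model parameters, whereas BPC averages the acoustic likelihood over the parameter uncertainty; in our notation (not necessarily the paper's),

        \hat{W}_{\text{plug-in}} = \arg\max_{W} P(W)\, p(X \mid \hat{\Lambda}_{W}), \qquad
        \hat{W}_{\text{BPC}} = \arg\max_{W} P(W) \int p(X \mid \Lambda_{W})\, p(\Lambda_{W} \mid \varphi_{W})\, d\Lambda_{W},

    where X is the test utterance, \hat{\Lambda}_{W} the point-estimated acoustic model for hypothesis W, and p(\Lambda_{W} \mid \varphi_{W}) the distribution over the model parameters; QBPC approximates the integral in the second rule.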

    Robust speech recognition based on a Bayesian prediction approach

    We study a category of robust speech recognition problems in which mismatches exist between training and testing conditions and no accurate knowledge of the mismatch mechanism is available. The only available information is the test data along with a set of pretrained Gaussian mixture continuous density hidden Markov models (CDHMMs). We investigate the problem from the viewpoint of Bayesian prediction. A simple prior distribution, namely a constrained uniform distribution, is adopted to characterize the uncertainty of the mean vectors of the CDHMMs. Two methods are studied: a model compensation technique based on the Bayesian predictive density, and a robust decision strategy called Viterbi Bayesian predictive classification. The proposed methods are compared with the conventional Viterbi decoding algorithm in speaker-independent recognition experiments on isolated digits and TI connected digit strings (TIDIGITS), where the mismatches between training and testing conditions are caused by (1) additive Gaussian white noise, (2) each of 25 types of actual additive ambient noise, and (3) gender difference. The experimental results show that the adopted prior distribution and the proposed techniques help to improve performance robustness under the examined mismatch conditions.
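    For a single Gaussian with pretrained mean \hat{\mu} and covariance \Sigma, the idea can be illustrated by the predictive density obtained under a constrained uniform prior on the mean (a one-component illustration in our notation, not the paper's exact formulation):

        \tilde{p}(x) = \frac{1}{|\Omega|} \int_{\Omega} \mathcal{N}(x; \mu, \Sigma)\, d\mu, \qquad
        \Omega = \{\mu : |\mu_{d} - \hat{\mu}_{d}| \le \delta_{d},\ d = 1, \dots, D\},

    where the bounds \delta_{d} encode the assumed range of mean shifts under mismatch; predictive densities of this kind underlie both the model compensation technique and the Viterbi Bayesian predictive classification strategy described above.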

    On adaptive decision rules and decision parameter adaptation for automatic speech recognition

    Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevailing training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variabilities in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with changing speakers and speaking conditions in real operational settings for high-performance speech recognition, such adaptive paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
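    For the Gaussian mean of a CDHMM mixture component, for example, the maximum a posteriori estimate takes the standard form of an interpolation between the prior (speaker-independent) mean and the adaptation-data statistics:

        \hat{\mu} = \frac{\tau\, \mu_{0} + \sum_{t} \gamma_{t}\, x_{t}}{\tau + \sum_{t} \gamma_{t}},

    where \mu_{0} is the prior mean, x_{t} the adaptation observations, \gamma_{t} the occupation probability of the component at frame t, and \tau a prior weight governing how quickly the estimate moves from the prior toward the new data.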

    Semi-Supervised Single- and Multi-Domain Regression with Multi-Domain Training

    We address the problems of multi-domain and single-domain regression based on distinct and unpaired labeled training sets for each of the domains and a large unlabeled training set from all domains. We formulate these problems as Bayesian estimation with partial knowledge of statistical relations. We propose a worst-case design strategy and study the resulting estimators. Our analysis explicitly accounts for the cardinality of the labeled sets and includes the special cases in which one of the labeled sets is very large or, in the other extreme, completely missing. We demonstrate our estimators in the context of removing expressions from facial images and in the context of audio-visual word recognition, and provide comparisons to several recently proposed multi-modal learning algorithms. Comment: 24 pages, 6 figures, 2 tables
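    The worst-case design strategy can be sketched as a minimax problem over the set of joint distributions consistent with the partially known statistical relations (a schematic in our notation; the paper's formulation additionally accounts for the cardinalities of the labeled sets):

        \hat{f} = \arg\min_{f \in \mathcal{F}}\; \max_{p \in \mathcal{P}}\; \mathbb{E}_{p}\!\left[\, \| y - f(x) \|^{2} \,\right],

    where \mathcal{P} is the set of distributions compatible with the relations estimable from the labeled and unlabeled training sets, and f is the regression function learned for a single domain or shared across domains.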