8 research outputs found

    A Spanish Corpus for Talking to the Elderly

    Paper presented at the 11th International Workshop on Spoken Dialogue Systems (IWSDS 2020), Madrid, Spain, 21-23 September 2020. This work presents a Spanish corpus developed within the framework of the EMPATHIC project (http://www.empathic-project.eu/). It was designed for building a dialogue system capable of talking to elderly people and promoting healthy habits through a coaching model. The corpus, which comprises audio, video and text channels, was acquired using a Wizard of Oz strategy. It was annotated with different labels corresponding to the different models needed in a dialogue system, including an emotion-based annotation that will be used to generate empathetic system reactions. The annotation at the different levels, along with the procedure employed, is described and analysed.
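
    Purely as an illustration of the kind of multi-channel, multi-level annotation described above, the sketch below shows one hypothetical record structure for a single Wizard-of-Oz turn. The field names and labels are invented for this example and do not reflect the actual EMPATHIC corpus schema.

```python
# Illustrative only: a hypothetical record for one Wizard-of-Oz turn, showing
# how audio, video and text channels plus several annotation layers (including
# emotion) might be organised. NOT the actual EMPATHIC corpus schema.
from dataclasses import dataclass, field

@dataclass
class AnnotatedTurn:
    session_id: str          # Wizard-of-Oz session identifier
    speaker: str             # "user" (elderly participant) or "wizard"
    audio_path: str          # audio segment for this turn
    video_path: str          # aligned video segment
    transcript: str          # manual transcription (text channel)
    dialogue_act: str        # label for the dialogue-model level
    emotion: str             # emotion label used for empathetic reactions
    extra_labels: dict = field(default_factory=dict)  # further annotation layers

turn = AnnotatedTurn(
    session_id="S001", speaker="user",
    audio_path="S001/turn_03.wav", video_path="S001/turn_03.mp4",
    transcript="Últimamente duermo poco.",
    dialogue_act="inform", emotion="sad",
)
```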

    A Study on Dialogue Reward Prediction for Open-Ended Conversational Agents

    The amount of dialogue history to include in a conversational agent is often underestimated and/or set empirically, and thus possibly naively. This suggests that principled investigations into optimal context windows are urgently needed, given that the amount of dialogue history and its representation can play an important role in the overall performance of a conversational system. This paper studies the amount of history required by conversational agents for reliably predicting dialogue rewards. The task of dialogue reward prediction is chosen for investigating the effects of varying amounts of dialogue history on system performance. Experimental results using a dataset of 18K human-human dialogues show that lengthy dialogue histories of at least 10 sentences are preferred over short ones (25 sentences being the best in our experiments), and that lengthy histories are useful for training dialogue reward predictors with strong positive correlations between target dialogue rewards and predicted ones.
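
    As a rough sketch of the kind of experiment described above, and not the paper's actual model or data, the following compares context-window sizes by fitting a simple regressor on the last N sentences of each dialogue and measuring the correlation between predicted and target rewards. The featuriser, regressor and toy data are all placeholders.

```python
# Minimal sketch: predict a dialogue-level reward from the last N sentences of
# history and compare context sizes via the correlation between predicted and
# target rewards. Toy data; a real study would evaluate on held-out dialogues.
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

vectoriser = HashingVectorizer(n_features=2**12, alternate_sign=False)

def featurise(dialogue, history_len):
    """Concatenate the last `history_len` sentences and vectorise them."""
    window = " ".join(dialogue[-history_len:])
    return vectoriser.transform([window]).toarray()[0]

def correlation_for_history(dialogues, rewards, history_len):
    """Fit a simple regressor and return the Pearson correlation between
    target and predicted rewards (computed on the same data, for illustration)."""
    X = np.stack([featurise(d, history_len) for d in dialogues])
    y = np.asarray(rewards)
    preds = Ridge(alpha=1.0).fit(X, y).predict(X)
    return pearsonr(y, preds)[0]

# Toy data: each dialogue is a list of sentences paired with a scalar reward.
dialogues = [["hello", "how can I help", "book a table", "done"],
             ["hi", "what do you need", "weather please", "it is sunny"],
             ["good morning", "yes", "call a taxi", "on its way"],
             ["hey", "sorry, I did not understand", "never mind", "goodbye"]]
rewards = [1.0, 0.5, 0.8, 0.1]

for n in (1, 10, 25):   # a short window vs. the 10- and 25-sentence windows discussed above
    print(n, correlation_for_history(dialogues, rewards, n))
```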

    Analysis and automatic identification of spontaneous emotions in speech from human-human and human-machine communication

    383 p. This research mainly focuses on improving our understanding of human-human and human-machine interactions by analysing participants' emotional status. For this purpose, we have developed and enhanced Speech Emotion Recognition (SER) systems for both kinds of interaction in real-life scenarios, with an explicit emphasis on the Spanish language. In this framework, we have conducted an in-depth analysis of how humans express emotions in speech when communicating with other persons or with machines in real situations. Thus, we have analysed and studied the way in which emotional information is expressed in a variety of true-to-life environments, which is a crucial aspect for the development of SER systems. This study aimed to comprehensively understand the challenge we wanted to address: identifying emotional information in speech using machine learning technologies. Neural networks have been demonstrated to be adequate tools for identifying events in speech and language. Most of the experiments aimed to make local comparisons between some specific aspects; thus, the experimental conditions were tailored to each particular analysis. The experiments across the different articles (from P1 to P19) are hardly comparable, owing to our continuous learning while dealing with the difficult task of identifying emotions in speech. In order to make a fair comparison, additional unpublished results are presented in the Appendix. These experiments were carried out under identical and rigorous conditions. This general comparison offers an overview of the advantages and disadvantages of the different methodologies for the automatic recognition of emotions in speech.
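
    The following is a minimal SER pipeline sketch in the spirit of the description above: acoustic features extracted from speech and a classifier trained to predict an emotion label. A hand-crafted MFCC front end and an SVM stand in for the neural models developed in the thesis, and all file names and labels are hypothetical.

```python
# Minimal Speech Emotion Recognition sketch: pooled MFCC features + SVM.
# Placeholder paths and labels; not the thesis's actual systems or corpora.
import numpy as np
import librosa                      # pip install librosa
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def mfcc_features(wav_path, sr=16000, n_mfcc=13):
    """Load an utterance and return mean/std-pooled MFCCs as one vector."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training data: (audio path, emotion label) pairs.
train = [("utt_001.wav", "neutral"), ("utt_002.wav", "sad"),
         ("utt_003.wav", "happy"), ("utt_004.wav", "angry")]

X = np.stack([mfcc_features(path) for path, _ in train])
y = [label for _, label in train]

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print(clf.predict([mfcc_features("utt_new.wav")]))   # predicted emotion label
```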

    Reward estimation for dialogue policy optimisation

    Viewing dialogue management as a reinforcement learning task enables a system to learn to act optimally by maximising a reward function. This reward function is designed to induce the system behaviour required for the target application; for goal-oriented applications, this usually means fulfilling the user's goal as efficiently as possible. However, in real-world spoken dialogue system applications, the reward is hard to measure because the user's goal is frequently known only to the user. Of course, the system can ask the user whether the goal has been satisfied, but this can be intrusive. Furthermore, in practice, the accuracy of the user's response has been found to be highly variable. This paper presents two approaches to tackling this problem. Firstly, a recurrent neural network is utilised as a task success predictor which is pre-trained from off-line data to estimate task success during subsequent on-line dialogue policy learning. Secondly, an on-line learning framework is described whereby a dialogue policy is jointly trained alongside a reward function modelled as a Gaussian process with active learning. This Gaussian process operates on a fixed-dimension embedding which encodes each varying-length dialogue. This dialogue embedding is generated in both a supervised and an unsupervised fashion using different variants of a recurrent neural network. The experimental results demonstrate the effectiveness of both the off-line and on-line methods. These methods enable practical on-line training of dialogue policies in real-world applications.
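
    A minimal PyTorch sketch of the central ingredient above, a recurrent network that maps a variable-length dialogue to a fixed-dimension embedding and predicts task success from it, is given below. Dimensions and data are illustrative, and the Gaussian-process reward model that would consume the embedding is omitted.

```python
# Minimal sketch: an LSTM reads one feature vector per turn, its final hidden
# state serves as a fixed-dimension dialogue embedding, and a linear head
# predicts task success from that embedding. Illustrative dimensions and data.
import torch
import torch.nn as nn

class TaskSuccessPredictor(nn.Module):
    def __init__(self, turn_dim=32, embed_dim=64):
        super().__init__()
        self.rnn = nn.LSTM(turn_dim, embed_dim, batch_first=True)
        self.head = nn.Linear(embed_dim, 1)    # task-success logit

    def forward(self, turns):                  # turns: (batch, n_turns, turn_dim)
        _, (h_n, _) = self.rnn(turns)
        dialogue_embedding = h_n[-1]           # fixed-size summary of the dialogue
        success_logit = self.head(dialogue_embedding)
        return dialogue_embedding, success_logit

model = TaskSuccessPredictor()
dialogue = torch.randn(1, 7, 32)               # one dialogue with 7 turns
embedding, logit = model(dialogue)
print(embedding.shape, torch.sigmoid(logit))   # torch.Size([1, 64]), P(success)
```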