18 research outputs found
When to Say What and How: Adapting the Elaborateness and Indirectness of Spoken Dialogue Systems
With the aim of designing a spoken dialogue system that is able to adapt to the user's communication idiosyncrasies, we investigate whether insights from the usage of communication styles in human-human interaction can be carried over to human-computer interaction. An extensive literature review demonstrates that communication styles play an important role in human communication. Using a multi-lingual data set, we show that there is a significant correlation between the communication style of the system and the preceding communication style of the user. Hence, two components that extend the standard architecture of spoken dialogue systems are presented: 1) a communication style classifier that automatically identifies the user's communication style and 2) a communication style selection module that selects an appropriate system communication style. We consider the communication styles elaborateness and indirectness, as it has been shown that they influence the user's satisfaction and the user's perception of a dialogue. We present a neural classification approach based on supervised learning for each task. The neural networks are trained and evaluated with features that can be automatically derived during an ongoing interaction in any spoken dialogue system. It is shown that both components yield solid results and outperform a majority-class baseline.
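The abstract does not give implementation details for the classifier. The following minimal sketch only illustrates the general idea of classifying a user communication style from features that are available at run time; the dialogue-act inventory, the three elaborateness labels and the use of scikit-learn's MLPClassifier as a stand-in for the neural approach are assumptions, not the paper's actual setup.

```python
# Illustrative sketch only: a communication-style classifier over features a
# spoken dialogue system can derive during an ongoing interaction.
# Dialogue-act inventory and label set are assumed, not taken from the paper.
import numpy as np
from sklearn.neural_network import MLPClassifier

DIALOGUE_ACTS = ["inform", "request", "confirm", "bye"]   # assumed inventory

def featurise(dialogue_act: str, n_words: int, turn_index: int) -> np.ndarray:
    """Encode one user utterance as [one-hot dialogue act | word count | turn index]."""
    act = np.zeros(len(DIALOGUE_ACTS))
    act[DIALOGUE_ACTS.index(dialogue_act)] = 1.0
    return np.concatenate([act, [n_words, turn_index]])

# Toy training data standing in for an annotated corpus.
X = np.stack([featurise("inform", 12, 3), featurise("confirm", 2, 5),
              featurise("request", 7, 1), featurise("inform", 25, 4)])
y = ["medium", "low", "medium", "high"]                   # elaborateness labels

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(X, y)
print(clf.predict([featurise("inform", 18, 2)]))
```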
Communication style modelling and adaptation in spoken dialogue systems
When communicating, people choose their words and non-verbal signals strategically in order to achieve their purpose. Hence, they focus not only on what they say but also on how they formulate it. The aim of this thesis is to examine the role of communication styles in human-computer interaction. This is approached from two angles: it is investigated how varying communication styles are perceived by the user and how communication styles can be integrated into spoken dialogue systems. In order to answer the first question, system requirements are analysed in a series of complex prototypes and various user evaluations are conducted to examine different user groups in diverse scenarios. The second aspect is addressed through the implementation of two new components for spoken dialogue systems.
Numerous parameters influence an interaction between two participants and the appropriate or preferred communication style, such as the speakers' roles, their cultures, their personalities, or the aim of the interaction. In order to enable adaptation through communication styles, these different aspects need to be set in relation to each other. Therefore, the Communication Style Perception Model is presented within the scope of this thesis. It covers elements that are relevant for the selection of the system communication style as well as aspects that are influenced by the system communication style. It is based on numerous user evaluations examining various user groups in different scenarios. Three experiments investigate the influence of numerous variables on the user's preference regarding the system communication style. The results show that both user traits and system properties influence the user's communication style preferences in human-computer interaction. Further experiments investigate how varying system communication styles affect the users when they are selected according to the users' personal preferences. To examine this, different communication styles are included in various systems and applications. The results show that the system's communication style influences the user's satisfaction and the user's perception of the dialogue. For specific applications like behaviour change support systems, the communication style even has an impact on the user's behaviour. Furthermore, the results show that there is no general preference regarding the system's communication style. The preference appears to be individual for every person, and the system needs to adapt its communication style to each user individually during every dialogue.
The second question of how communication styles can be integrated into spoken dialogue systems is addressed by extending the standard architecture of spoken dialogue systems. Two new components are proposed, implemented and evaluated: a communication style classifier that automatically identifies the user's communication style and a communication style selection module that selects an appropriate system communication style. Both tasks are formulated as classification problems. Due to the novelty of the underlying machine learning task, a multi-lingual corpus is created, containing 258 dialogues with annotations of elaborateness and indirectness for each of the 7,930 dialogue acts. For the user communication style recognition, three different classifiers are compared on the task: a support vector machine classifier, a multi-layer perceptron classifier, and a custom recurrent neural network classifier. Furthermore, different feature sets are tested as input for the classifiers. All features that are used for the communication style classification can be automatically recognised in spoken dialogue systems during an ongoing interaction, without any prior annotation. The results show that for the elaborateness, analysing the utterance length in relation to the dialogue act provides enough information to achieve good classification performance. The indirectness appears to be a more difficult classification task, and additional linguistic features in the form of word embeddings improve the classification results. Furthermore, temporal information is beneficial in this case. For the system communication style selection, a multi-layer perceptron classifier is trained and evaluated, using features that encode what the system wants to say in the current turn, what the user wants from the system and how the user talks to the system. As for the first task, the features can be automatically recognised in spoken dialogue systems. The results outperform both a majority-class classifier and a baseline that mimics the last user communication style for each of the four languages. When both components are combined, the spoken dialogue system is able to recognise the user's communication style and select an appropriate communication style for the system.
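The two baselines referred to here (a majority-class classifier and a baseline that mimics the user's last communication style) can be stated very compactly. The sketch below is an illustration with made-up integer style labels, not the thesis's evaluation code.

```python
# Hedged sketch of the two baselines the selection module is compared against.
# Style labels (0 = low, 1 = medium, 2 = high elaborateness) are assumed.
from collections import Counter

def majority_class_baseline(train_labels, n_test):
    """Always predict the system style that is most frequent in the training data."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n_test

def mimic_baseline(user_styles):
    """Predict, for each turn, the communication style the user just used."""
    return list(user_styles)

train_labels = [2, 2, 1, 0, 2]
user_styles = [1, 2, 0]                                          # user's style in the test turns
print(majority_class_baseline(train_labels, len(user_styles)))   # [2, 2, 2]
print(mimic_baseline(user_styles))                                # [1, 2, 0]
```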
IQ-adaptive statistical dialogue management using Gaussian processes
Adapting a Spoken Dialogue System to the user's satisfaction is expected to result in more successful dialogues. In this thesis, Gaussian processes are used to model a policy for a statistical Spoken Dialogue System, and the Interaction Quality (IQ) metric, a measure of the user's satisfaction, is used to train this policy. As the policy decides which action is taken next at any given point, the dialogue flow is thereby adapted to the IQ. Afterwards, it is investigated whether the incorporation of the IQ metric is beneficial. To this end, different learning strategies with and without the IQ metric are used to train different policies. The performance of all trained policies is then evaluated with regard to dialogue completion, task success, the average length of a dialogue and the average IQ value at the end of a dialogue.
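As a rough illustration of the idea of coupling a Gaussian process with the IQ metric (and not of the GP reinforcement learning setup actually used in the thesis), one can imagine a GP regressor that scores state-action pairs and is trained on the IQ observed for logged dialogues; all encodings and data below are assumptions.

```python
# Simplified sketch: a Gaussian process scores state-action pairs, trained on
# the Interaction Quality (IQ, 1-5) of logged dialogues. Feature encodings and
# data are invented for illustration.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

N_ACTIONS = 3

def encode(state: np.ndarray, action_id: int) -> np.ndarray:
    """Concatenate a state vector with a one-hot encoding of the chosen action."""
    one_hot = np.zeros(N_ACTIONS)
    one_hot[action_id] = 1.0
    return np.concatenate([state, one_hot])

# Logged (state, action) pairs, each labelled with the final IQ of its dialogue.
rng = np.random.RandomState(0)
states = rng.rand(20, 4)
actions = rng.randint(0, N_ACTIONS, size=20)
iq_values = rng.randint(1, 6, size=20).astype(float)

X = np.stack([encode(s, a) for s, a in zip(states, actions)])
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, iq_values)

def greedy_action(state: np.ndarray) -> int:
    """Pick the action whose predicted IQ is highest."""
    scores = [gp.predict(encode(state, a).reshape(1, -1))[0] for a in range(N_ACTIONS)]
    return int(np.argmax(scores))

print(greedy_action(np.array([0.2, 0.8, 0.1, 0.5])))
```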
Adaptive dialogue management for a script knowledge based conversational assistant
Today's spoken dialogue systems already fulfil many requirements, and human-machine interaction with them works very well. Nevertheless, users still adapt their communication style to these systems. In order to develop appropriate communication style strategies for a spoken dialogue system, we have investigated the communication style factors elaborateness and directness.
Within this work, an indoor navigation spoken dialogue system was developed which implements different communication style strategies and knowledge states. Moreover, a formula for adapting the system's communication style to the user's communication style was developed. The system navigates the user through the University of Ulm based on scripts. In order to create these scripts, a data collection study was carried out in which 97 participants described the selected routes with the help of videos that had been recorded beforehand. Finally, the system was evaluated in a user study with 30 participants.
In summary, the system performed well, but we found no significant difference between the investigated communication styles. However, the different knowledge states of the system revealed some significant differences. Overall, we were able to demonstrate the potential of the indoor navigation application and gained new insights into the communication styles and the interaction with a non-omniscient system.
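The abstract mentions an adaptation formula but does not state it. Purely as a generic placeholder for how such an adaptation rule could look (and explicitly not the formula developed in this work), one might smooth the system's style towards the user's observed style over the course of the dialogue:

```python
# Generic placeholder, not the formula from this work: exponential smoothing of
# the system's communication style towards the user's observed style.
def adapt_style(system_style: float, user_style: float, alpha: float = 0.5) -> float:
    """Move the system's style (e.g. elaborateness on a 0-1 scale) towards the user's."""
    return (1 - alpha) * system_style + alpha * user_style

style = 0.5
for observed_user_style in [0.9, 0.8, 1.0]:   # user speaks rather elaborately
    style = adapt_style(style, observed_user_style)
print(round(style, 3))                        # 0.875
```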
Automatic modification of communication style in dialogue management
Paper presented at: INLG 2016 Workshop on Computational Creativity and Natural Language Generation, held in Edinburgh, Scotland, 5-8 September 2016. In task-oriented dialogues, there is often only one right answer the system can give. However, a lack of variation can seem repetitive and unnatural. Humans change the way they express something, e.g. by being more or less concise. We aim to approximate this ability by automatically varying the level of verbosity and directness of a given system action. In this work, we illustrate how verbosity and directness may be utilised in adaptive dialogue management and present different approaches to automatically generate varying levels of verbosity and directness for given system actions. Thereby, new and unforeseen system actions can be created dynamically. This paper is part of a project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 645012.
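The paper's concrete generation approaches are not reproduced in this abstract. Purely as an illustration of the underlying idea, a single system action could be realised at different verbosity and directness levels via templates; the action name and templates below are invented.

```python
# Illustrative only: template-based realisation of one (hypothetical) system
# action at different verbosity and directness levels.
TEMPLATES = {
    ("low", "direct"):    "Turn left.",
    ("high", "direct"):   "Please turn left at the next corner, right after the glass door.",
    ("low", "indirect"):  "Left might work.",
    ("high", "indirect"): "It might be best if you turned left at the next corner.",
}

def realise(action: str, verbosity: str, directness: str) -> str:
    """Return a surface form for the toy action 'navigate(left)'."""
    assert action == "navigate(left)"
    return TEMPLATES[(verbosity, directness)]

print(realise("navigate(left)", "high", "indirect"))
```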
A social companion and conversation partner for elderly
Paper presented at: 8th International Workshop on Spoken Dialogue Systems (IWSDS), held 6-9 June in Farmington, United States. In this work, we present the development and evaluation of a social companion and conversation partner for the special user group of the elderly. With the aim of designing a user-adaptive system, we responded to the desires of the elderly which had been identified during various interviews and created a companion which talks and listens to its elderly users. Moreover, we conducted a user study with a small group of retired seniors living at home or in a nursing home. The results show that our companion and its dialogue were perceived very positively and that a social companion and conversation partner is indeed in demand by lonely seniors. This work is part of a project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 645012.
The next step: intelligent digital assistance for clinical operating rooms
With the emergence of new technologies, the surgical working environment becomes increasingly complex and comprises many medical devices that have to be taken care of. However, the goal is to reduce the workload of the surgical team to allow them to fully focus on the actual surgical procedure. Therefore, new strategies are needed to keep the working environment manageable. Existing research projects in the field of intelligent medical environments mostly concentrate on workflow modeling or single smart features rather than building up a complete intelligent environment. In this article, we present the concept of intelligent digital assistance for clinical operating rooms (IDACO), providing the surgeon with assistance in many different situations before and during an ongoing procedure using natural spoken language. The speech interface enables the surgeon to concentrate on the surgery and control the technical environment at the same time, without having to take care of how to interact with the system. Furthermore, the system observes the context of the surgery and controls several devices autonomously at the appropriate time during the procedure.
Estimating User Communication Styles for Spoken Dialogue Systems: Data
We present a neural network approach to estimate the communication style of spoken interaction, namely the stylistic variations elaborateness and directness, and investigate which types of input features to the estimator are necessary to achieve good performance. First, we describe our annotated corpus of recordings in the health care domain and analyse the corpus statistics in terms of agreement, correlation and reliability of the ratings. We use this corpus to estimate the elaborateness and the directness of each utterance. We test different feature sets consisting of dialogue act features, grammatical features and linguistic features as input for our classifier and perform classification into two and three classes. Our classifiers use only features that can be automatically derived during an ongoing interaction in any spoken dialogue system without any prior annotation. Our results show that the elaborateness can be classified by using only the dialogue act and the number of words contained in the corresponding utterance. The directness is a more difficult classification task, and additional linguistic features in the form of word embeddings improve the classification results. Afterwards, we run a comparison with a support vector machine and a recurrent neural network classifier.
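To make the feature description concrete, the sketch below shows one way of augmenting the dialogue-act and word-count features with an averaged word embedding for the directness task; the embedding table, its dimensionality and the toy vocabulary are placeholders, not the resources used in the paper.

```python
# Hedged sketch: dialogue-act/word-count features plus an averaged word
# embedding, as additional input for the directness classifier. The embedding
# table below is a random placeholder.
import numpy as np

EMBEDDING_DIM = 50
rng = np.random.RandomState(0)
EMBEDDINGS = {w: rng.rand(EMBEDDING_DIM)
              for w in ["could", "you", "please", "open", "the", "window"]}

def directness_features(dialogue_act_id: int, utterance: str) -> np.ndarray:
    """Return [dialogue act id, word count, averaged word embedding]."""
    tokens = utterance.lower().split()
    vectors = [EMBEDDINGS.get(t, np.zeros(EMBEDDING_DIM)) for t in tokens]
    avg_embedding = np.mean(vectors, axis=0) if vectors else np.zeros(EMBEDDING_DIM)
    return np.concatenate([[dialogue_act_id, len(tokens)], avg_embedding])

print(directness_features(2, "Could you please open the window").shape)  # (52,)
```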