Strategic dialogue management via deep reinforcement learning
Artificially intelligent agents equipped with strategic skills that can negotiate during their interactions with other natural or artificial agents are still underdeveloped. This paper describes a successful application of Deep Reinforcement Learning (DRL) for training intelligent agents with strategic conversational skills in a situated dialogue setting. Previous studies have modelled the behaviour of strategic agents using supervised learning and traditional reinforcement learning techniques, the latter using tabular representations or learning with linear function approximation. In this study, we apply DRL with a high-dimensional state space to the strategic board game of Settlers of Catan, where players can offer resources in exchange for others and can also reply to offers made by other players. Our experimental results show that the DRL-based learnt policies significantly outperformed several baselines, including random, rule-based, and supervised behaviours. The DRL-based policy has a 53% win rate versus 3 automated players ('bots'), whereas a supervised player trained on a dialogue corpus in this setting achieved only 27% versus the same 3 bots. This result supports the claim that DRL is a promising framework for training dialogue systems and strategic agents with negotiation abilities.
Reinforcement learning for trading dialogue agents in non-cooperative negotiations
Recent advances in automating Dialogue Management have been mainly made in
cooperative environments, where the dialogue system tries to help a human to meet
their goals. In non-cooperative environments though, such as competitive trading,
there is still much work to be done. The complexity of such an environment rises
as there is usually imperfect information about the interlocutors’ goals and states.
The thesis shows that non-cooperative dialogue agents are capable of learning how
to successfully negotiate in a variety of trading-game settings, using Reinforcement
Learning, and results are presented from testing the trained dialogue policies with
humans. The agents learned when and how to manipulate using dialogue, how to
judge the decisions of their rivals, how much information they should expose, as well
as how to effectively map their adversaries' needs in order to predict and exploit their
actions. Initially the environment was a two-player trading game (“Taikun”). The
agent learned how to use explicit linguistic manipulation, even with risks of exposure
(detection) where severe penalties apply. A more complex opponent model for
adversaries was also implemented, where we modelled all trading dialogue moves as
implicitly manipulating the adversary’s opponent model, and we worked in a more
complex game (“Catan”). In that multi-agent environment we show that agents
can learn to be legitimately persuasive or deceitful. Agents which learned how to
manipulate opponents using dialogue are more successful than ones which do not
manipulate. We also demonstrate that trading dialogues are more successful when
the learning agent builds an estimate of the adversary's hidden goals and preferences.
Furthermore, the thesis shows that policies trained in bilateral negotiations can be
very effective in multilateral ones (i.e. the 4-player version of Catan). The findings
suggest that it is possible to train non-cooperative dialogue agents which successfully
trade using linguistic manipulation. Such non-cooperative agents may have
important future applications, such as in automated debating, police investigation,
games, and education.
Towards Integration of Cognitive Models in Dialogue Management: Designing the Virtual Negotiation Coach Application
This paper presents an approach to flexible and adaptive dialogue management driven by cognitive modelling of human dialogue behaviour. Artificially intelligent agents, based on the ACT-R cognitive architecture, participate together with human actors in (meta)cognitive skills training within a negotiation scenario. The agent employs instance-based learning to decide about its own actions and to reflect on the behaviour of the opponent. We show that task-related actions can be handled by a cognitive agent who is a plausible dialogue partner. Separating task-related and dialogue control actions enables the application of sophisticated models along with a flexible architecture in which various alternative modelling methods can be combined. We evaluated the proposed approach with users, assessing the relative contribution of various factors to the overall usability of a dialogue system. Subjective perceptions of effectiveness, efficiency and satisfaction were correlated with various objective performance metrics, e.g. number of (in)appropriate system responses, recovery strategies, and interaction pace. We observed that dialogue system usability is determined most by the quality of agreements reached in terms of estimated Pareto optimality, by the negotiation strategies the user selected, and by the quality of system recognition, interpretation and responses. We compared human-human and human-agent performance with respect to the number and quality of agreements reached, estimated cooperativeness level, and frequency of accepted negative outcomes. Evaluation experiments showed promising, consistently positive results throughout the range of the relevant scales.