Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies
Abstract We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other, without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC) and Win or Learn Fast Policy Hill-Climbing (PHC-WoLF) algorithms, varying the scenario complexity (state space size), the number of training episodes, the learning rate, and the exploration rate. Our results show that Q-learning generally fails to converge, whereas PHC and PHC-WoLF always converge and perform similarly. We also show that very high, gradually decreasing exploration rates are required for convergence. We conclude that multi-agent RL of dialogue policies is a promising alternative to using single-agent RL with SUs or learning directly from corpora.
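The PHC update compared in this abstract can be sketched as follows. This is a minimal tabular sketch, not the paper's implementation: the learning rates, the dialogue state/action encoding, and the use of dictionaries for the tables are all illustrative assumptions.

```python
from collections import defaultdict

def phc_step(Q, pi, s, a, r, s_next, actions, alpha=0.1, gamma=0.95, delta=0.01):
    """One Policy Hill-Climbing update: a Q-learning step on the value table,
    then a small move of the mixed policy toward the currently greedy action.
    Hyperparameter values are illustrative."""
    # Standard Q-learning update on the value table.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next][b] for b in actions) - Q[s][a])
    # Shift probability mass of size delta toward the greedy action.
    greedy = max(actions, key=lambda b: Q[s][b])
    for b in actions:
        if b == greedy:
            pi[s][b] = min(1.0, pi[s][b] + delta)
        else:
            pi[s][b] = max(0.0, pi[s][b] - delta / (len(actions) - 1))
    # Renormalise so pi[s] remains a probability distribution.
    total = sum(pi[s][b] for b in actions)
    for b in actions:
        pi[s][b] /= total

# Tiny usage: two actions, one update after a reward of 1.0 for action 0.
Q = defaultdict(lambda: defaultdict(float))
pi = defaultdict(lambda: {0: 0.5, 1: 0.5})
phc_step(Q, pi, "s0", 0, 1.0, "s1", [0, 1])
```

PHC-WoLF differs only in using two policy step sizes, taking the larger one when the agent is "losing" relative to its average policy.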
A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation
We present a novel negotiation model that allows an agent to learn how to
negotiate during concurrent bilateral negotiations in unknown and dynamic
e-markets. The agent uses an actor-critic architecture with model-free
reinforcement learning to learn a strategy expressed as a deep neural network.
We pre-train the strategy by supervision from synthetic market data, thereby
decreasing the exploration time required for learning during negotiation. As a
result, we can build automated agents for concurrent negotiations that can
adapt to different e-market settings without the need to be pre-programmed. Our
experimental evaluation shows that our deep reinforcement learning-based agents
outperform two existing well-known negotiation strategies in one-to-many
concurrent bilateral negotiations for a range of e-market settings.
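The pre-training step described above can be sketched as supervised regression on synthetic market data before any RL fine-tuning. Everything below is an illustrative assumption: the feature layout, the hand-crafted "expert" bidding rule, and the linear actor standing in for the paper's deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic market data: state features -> target bid from a hand-crafted
# "expert" rule (both are stand-ins, not taken from the paper).
X = rng.uniform(size=(500, 3))           # e.g. time left, best rival offer, reserve price
y = 0.5 * X[:, 1] + 0.5 * (1 - X[:, 0])  # expert bid in [0, 1]

# A linear "actor" pre-trained by supervised regression; in the paper this
# role is played by a deep network later fine-tuned with actor-critic RL.
w, b = np.zeros(3), 0.0
lr = 0.1
for _ in range(2000):
    err = X @ w + b - y                  # prediction error on the synthetic data
    w -= lr * X.T @ err / len(y)         # gradient step on mean squared error
    b -= lr * err.mean()

mse = float(((X @ w + b - y) ** 2).mean())
```

After pre-training, the actor already proposes sensible bids, so the exploration phase of RL starts from a useful strategy rather than from random behaviour.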
Towards Integration of Cognitive Models in Dialogue Management: Designing the Virtual Negotiation Coach Application
This paper presents an approach to flexible and adaptive dialogue management driven by cognitive modelling of human dialogue behaviour. Artificial intelligent agents based on the ACT-R cognitive architecture participate, together with human actors, in (meta)cognitive skills training within a negotiation scenario. The agent employs instance-based learning to decide about its own actions and to reflect on the behaviour of the opponent. We show that task-related actions can be handled by a cognitive agent that is a plausible dialogue partner. Separating task-related and dialogue control actions enables the application of sophisticated models along with a flexible architecture in which various alternative modelling methods can be combined. We evaluated the proposed approach with users, assessing the relative contribution of various factors to the overall usability of a dialogue system. Subjective perceptions of effectiveness, efficiency and satisfaction were correlated with various objective performance metrics, e.g. the number of (in)appropriate system responses, recovery strategies, and interaction pace. We observed that dialogue system usability is determined most by the quality of the agreements reached in terms of estimated Pareto optimality, by the negotiation strategies the user selected, and by the quality of system recognition, interpretation and responses. We compared human-human and human-agent performance with respect to the number and quality of agreements reached, estimated cooperativeness level, and frequency of accepted negative outcomes. Evaluation experiments showed promising, consistently positive results throughout the range of the relevant scales.
Reinforcement learning for trading dialogue agents in non-cooperative negotiations
Recent advances in automating Dialogue Management have been mainly made in
cooperative environments, where the dialogue system tries to help a human meet
their goals. In non-cooperative environments, though, such as competitive trading,
there is still much work to be done. The complexity of such an environment rises
as there is usually imperfect information about the interlocutors’ goals and states.
The thesis shows that non-cooperative dialogue agents are capable of learning how
to successfully negotiate in a variety of trading-game settings, using Reinforcement
Learning, and results are presented from testing the trained dialogue policies with
humans. The agents learned when and how to manipulate using dialogue, how to
judge the decisions of their rivals, how much information they should expose, as well
as how to model their adversaries' needs in order to predict and exploit their
actions. Initially the environment was a two-player trading game (“Taikun”). The
agent learned how to use explicit linguistic manipulation, even with risks of exposure
(detection) where severe penalties apply. A more complex opponent model for
adversaries was also implemented, where we modelled all trading dialogue moves as
implicitly manipulating the adversary’s opponent model, and we worked in a more
complex game (“Catan”). In that multi-agent environment we show that agents
can learn to be legitimately persuasive or deceitful. Agents which learned how to
manipulate opponents using dialogue are more successful than ones which did not.
We also demonstrate that trading dialogues are more successful when
the learning agent builds an estimate of the adversarial hidden goals and preferences.
Furthermore, the thesis shows that policies trained in bilateral negotiations can be
very effective in multilateral ones (i.e. the 4-player version of Catan). The findings
suggest that it is possible to train non-cooperative dialogue agents which successfully
trade using linguistic manipulation. Such non-cooperative agents may have
important future applications, such as automated debating, police investigation,
games, and education.
Reinforcement Learning for Argumentation
Argumentation, as a logical reasoning approach, plays an important role in improving communication, increasing agreeability, and resolving conflicts in multi-agent systems (MAS). The present research aims to explore the effectiveness of argumentation in reinforcement learning of intelligent agents in terms of outperforming baseline agents, transferring learning between argument graphs, and improving the relevance and coherence of dialogue.
This research developed 'ARGUMENTO+' to encourage a reinforcement learning (RL) agent playing an abstract argument game to improve its performance against different baseline agents, using a newly proposed state representation that makes each state unique. When attempting to generalise this approach to other argumentation graphs, the RL agent was not able to effectively identify the argument patterns that transfer to other domains.
In order to improve the RL agent's ability to recognise argument patterns, this research adopted a logic-based dialogue game approach with richer argument representations. In the DE dialogue game, the RL agent played against hard-coded heuristic agents and showed improved performance over the baseline agents, using a reward function that encourages the RL agent to win the game with a minimum number of moves. This also allowed the RL agent to develop its own strategy, make moves, and learn to argue.
This thesis also presents a new reward function that makes the RL agent's dialogue more coherent and relevant than its opponents'. The RL agent was designed to recognise argument patterns, i.e. argumentation schemes and evidence support sources, which can be related to different domains. The RL agent used a transfer learning method to generalise and transfer experiences and to speed up learning.
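The kind of shaped reward described here, winning with few moves while keeping moves coherent and relevant, can be sketched as a simple weighted sum. The function name, the weights, and the way coherence is counted are all illustrative assumptions, not values from the thesis.

```python
def dialogue_reward(won, num_moves, coherent_moves,
                    win_bonus=1.0, move_cost=0.02, coherence_bonus=0.05):
    """Shaped terminal reward for a dialogue-game episode (illustrative weights):
    reward winning, penalise each move taken, and reward coherent/relevant moves."""
    r = win_bonus if won else -win_bonus   # outcome term
    r -= move_cost * num_moves             # pressure toward shorter games
    r += coherence_bonus * coherent_moves  # pressure toward coherent dialogue
    return r
```

Under such a scheme, two winning policies are ranked by efficiency and coherence: for instance, winning in 5 moves scores higher than winning in 20 with the same coherence count.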