Goal-oriented Dialogue Policy Learning from Failures

Author: Chen Xiaoping
Lu Keting
Zhang Shiqi
Publication date: 22/11/2018

Reinforcement learning methods have been used for learning dialogue policies. However, learning an effective dialogue policy frequently requires prohibitively many conversations. This is partly because of the sparse rewards in dialogues, and the very few successful dialogues in the early learning phase. Hindsight experience replay (HER) enables learning from failures, but the vanilla HER is inapplicable to dialogue learning due to the implicit goals. In this work, we develop two complex HER methods providing different trade-offs between complexity and performance, and, for the first time, enable HER-based dialogue policy learning. Experiments using a realistic user simulator show that our HER methods perform better than existing experience replay methods (as applied to deep Q-networks) in learning rate.
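To make the idea of learning from failures concrete, here is a minimal sketch of vanilla HER relabeling with explicit goals, which is the setting the abstract says does not carry over directly to dialogue. It is not the authors' method. The `Transition` fields, the integer states, and the `sparse_reward` function are assumptions made for illustration only.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def sparse_reward(next_state: int, goal: int) -> float:
    # Hypothetical sparse reward: 1 only when the goal state is reached.
    return 1.0 if next_state == goal else 0.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    # Vanilla HER, "final" strategy: treat the last state actually reached
    # as if it had been the goal, then recompute every reward.
    achieved = episode[-1].next_state
    return [
        Transition(t.state, t.action, t.next_state, achieved,
                   sparse_reward(t.next_state, achieved))
        for t in episode
    ]


# A failed episode: the intended goal was state 9, but the agent ended in state 3.
failed = [
    Transition(0, 1, 1, 9, 0.0),
    Transition(1, 1, 2, 9, 0.0),
    Transition(2, 1, 3, 9, 0.0),
]
relabeled = relabel_with_final_state(failed)
```

Both `failed` and `relabeled` would typically be stored in the replay buffer, so the learner sees at least one rewarded transition even when the original goal was missed.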

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Introduction for speech and language for interactive robots

Author: Argentieri
Athanasopoulos
Bordes
Chen
Cuayáhuitl
Cuayáhuitl
Ferreira
Gabriel Skantze
Heriberto Cuayáhuitl
Kazunori Komatani
Lee
Lison
Lorenzo-Trueba
Mavridis
Misu
Nose
Yoshino
Zukerman
Publication venue: 'Elsevier BV'
Publication date: 01/11/2015

This special issue includes research articles which apply spoken language processing to robots that interact with human users through speech, possibly combined with other modalities. Robots that can listen to human speech, understand it, interact according to the conveyed meaning, and respond represent major research and technological challenges. Their common aim is to equip robots with natural interaction abilities. However, robotics and spoken language processing are areas that are typically studied within their respective communities with limited communication across disciplinary boundaries. The articles in this special issue represent examples that address the need for an increased multidisciplinary exchange of ideas.

University of Lincoln Institutional Repository

Crossref

Heriot Watt Pure

Reinforcement Learning Approaches in Social Robotics

Author: Akalin Neziha
Loutfi Amy
Publication date: 01/02/2021

This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts through trial-and-error with its environment to discover an optimal behavior. Since interaction is a key component in both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is focused particularly on studies that include social physical robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to a survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the papers (whether or not they use subjective and algorithmic measures), a discussion in view of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including the approaches that have thus far received less attention. Thus, this paper aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.

arXiv.org e-Print Archive

Directory of Open Access Journals

Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

Author: Gasic Milica
Mrksic Nikola
Su Pei-Hao
Vandyke David
Wen Tsung-Hsien
Young Steve
Publication date: 01/01/2015

Statistical spoken dialogue systems have the attractive property of being able to be optimised from data via interactions with real users. However, in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real users, where learning is costly. Reward shaping is one promising technique for addressing these concerns. Here we examine three recurrent neural network (RNN) approaches for providing reward shaping information in addition to the primary (task-orientated) environmental feedback. These RNNs are trained on returns from dialogues generated by a simulated user and attempt to diffuse the overall evaluation of the dialogue back down to the turn level to guide the agent towards good behaviour faster. In both simulated and real user scenarios these RNNs are shown to increase policy learning speed. Importantly, they do not require prior knowledge of the user's goal. Comment: Accepted for publication in SigDial 201
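One common way to add turn-level guidance on top of a sparse task reward is potential-based shaping, sketched below. This is an illustrative assumption, not necessarily the formulation used in the paper: the list `potentials` stands in for turn-level estimates that a trained model (such as an RNN) might produce, and its values are made up.

```python
import math
from typing import List, Sequence


def shaping_terms(potentials: Sequence[float], gamma: float) -> List[float]:
    # Potential-based shaping term for each turn t:
    #   F_t = gamma * phi(s_{t+1}) - phi(s_t)
    return [gamma * potentials[t + 1] - potentials[t]
            for t in range(len(potentials) - 1)]


def discounted_sum(values: Sequence[float], gamma: float) -> float:
    # Discounted return of a sequence of per-turn values.
    return sum((gamma ** t) * v for t, v in enumerate(values))


# Hypothetical turn-level potentials for a four-state (three-turn) dialogue.
potentials = [0.0, 0.2, 0.5, 0.9]
gamma = 0.9

terms = shaping_terms(potentials, gamma)
total_shaping = discounted_sum(terms, gamma)

# The discounted shaping terms telescope to gamma^T * phi(s_T) - phi(s_0),
# so the extra signal depends only on the first and last potentials.
num_turns = len(terms)
telescoped = gamma ** num_turns * potentials[-1] - potentials[0]
```

The agent would receive `env_reward + terms[t]` at turn `t`; the telescoping property is what keeps this kind of shaping from changing which policy is optimal.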

arXiv.org e-Print Archive

Crossref

Dialogue management using reinforcement learning

Author: Fakhrurroja Hanif
Machbub Carmadi
Rofi’ah Binashir
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/06/2021

Dialogue is widely used for verbal communication in human-robot interaction, for example with assistant robots in hospitals. However, such robots are usually limited to predetermined dialogue, which makes it difficult for them to understand new words for a new desired goal. In this paper, we discuss conversation in Indonesian on entertainment, motivation, emergency, and helping, using a knowledge growing method. We provided mp3 audio for music, fairy tale, comedy, and motivation requests. The execution time for these requests was 3.74 ms on average. In an emergency situation, the patient is able to ask the robot to call the nurse. The robot records the complaint of pain and informs the nurse. From 7 emergency reports, all complaints were successfully saved to the database. In helping conversation, the robot walks to pick up the patient's belongings. When the robot does not understand the patient's conversation, it asks until it understands. Through the asking conversation, knowledge expands from 2 to 10, with learning execution time going from 1405 ms to 3490 ms. SARSA reached steady state faster because of higher cumulative rewards. Both Q-learning and SARSA achieved the desired object within 200 episodes. We conclude that the RL method overcame the robot's knowledge limitation in achieving new dialogue goals for a patient assistant.
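The two algorithms compared in the abstract differ only in how they form the update target. The sketch below shows the standard tabular update rules; the state numbers, the action names `ask` and `fetch`, and the values of `alpha` and `gamma` are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # Off-policy: bootstrap from the best action available in the next state.
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    # On-policy: bootstrap from the action the agent actually takes next.
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


actions = ["ask", "fetch"]  # hypothetical dialogue actions

# Start both tables from the same values for the next state.
Q_ql = defaultdict(float, {(1, "ask"): 1.0, (1, "fetch"): 0.0})
Q_sarsa = defaultdict(float, {(1, "ask"): 1.0, (1, "fetch"): 0.0})

# Same transition for both: state 0, action "ask", reward 0, next state 1.
q_learning_update(Q_ql, 0, "ask", 0.0, 1, actions)
# SARSA's next action is the lower-valued "fetch", e.g. chosen by exploration.
sarsa_update(Q_sarsa, 0, "ask", 0.0, 1, "fetch")
```

After this single step the Q-learning entry for `(0, "ask")` has moved towards the value of the best next action, while the SARSA entry reflects the exploratory action that was actually selected.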

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Hybrid chat and task dialogue for more engaging HRI using reinforcement learning

Author: Dondrup Christian
Lemon Oliver
Novikova Jekaterina
Papaioannou Ioannis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/12/2017

Heriot Watt Pure

Speeding-up Action Learning in a Social Robot with Dyna-Q+: A Bioinspired Probabilistic Model Approach

Author: Castro González Álvaro
González González Rodrigo
Malfaz Vázquez María Ángeles
Maroto Gómez Marcos
Salichs Sánchez-Caballero Miguel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2021

Robotic systems that are developed for social and dynamic environments require adaptive mechanisms to operate successfully. Consequently, learning from rewards has provided meaningful results in applications involving human-robot interaction. In those cases where the robot's state space and the number of actions are extensive, dimensionality becomes intractable and this drastically slows down the learning process. This effect is especially noticeable in one-step temporal difference methods because just one update is performed per robot-environment interaction. In this paper, we demonstrate how the action-based learning of a social robot can be improved by combining classical temporal difference reinforcement learning methods, such as Q-learning or Q(λ), with a probabilistic model of the environment. This architecture, which we have called Dyna, allows the robot to simultaneously act and plan using the experience obtained during real human-robot interactions. Principally, Dyna improves classical algorithms in terms of convergence speed and stability, which strengthens the learning process. Hence, in this work we have embedded a Dyna architecture in our social robot, Mini, to endow it with the ability to autonomously maintain an optimal internal state while living in a dynamic environment.
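The general Dyna idea of interleaving real updates with planning updates replayed from a learned model can be sketched in tabular form as follows. This is a simplified Dyna-Q with a deterministic model, not the Dyna-Q+ probabilistic model described in the abstract; the two-step chain environment, the single action `go`, and all parameter values are assumptions for illustration.

```python
import random
from collections import defaultdict


def dyna_q_step(Q, model, s, a, r, s_next, actions, rng,
                alpha=0.5, gamma=0.9, planning_steps=20):
    def update(us, ua, ur, unext):
        # One-step Q-learning update.
        best_next = max(Q[(unext, b)] for b in actions)
        Q[(us, ua)] += alpha * (ur + gamma * best_next - Q[(us, ua)])

    update(s, a, r, s_next)            # 1. learn from the real interaction
    model[(s, a)] = (r, s_next)        # 2. remember what the world did
    for _ in range(planning_steps):    # 3. replay remembered transitions
        ps, pa = rng.choice(list(model))
        pr, pnext = model[(ps, pa)]
        update(ps, pa, pr, pnext)


def run_chain(planning_steps):
    # Hypothetical chain: state 0 -> state 1 -> terminal state 2 (reward 1).
    Q, model = defaultdict(float), {}
    rng = random.Random(0)
    for s, a, r, s_next in [(0, "go", 0.0, 1), (1, "go", 1.0, 2)]:
        dyna_q_step(Q, model, s, a, r, s_next, ["go"], rng,
                    planning_steps=planning_steps)
    return Q


q_without_planning = run_chain(planning_steps=0)
q_with_planning = run_chain(planning_steps=20)
```

With `planning_steps=0` the reward seen at the second step has not yet reached the value of state 0, whereas the planning replays can propagate it backwards without any further real interaction. This is the mechanism behind the convergence-speed benefit the abstract attributes to Dyna.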

Universidad Carlos III de Madrid e-Archivo