
    Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning

    Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback is the time needed to satisfactorily learn a task, especially in large state-action spaces. To address this issue, interactive reinforcement learning proposes the use of externally sourced information to speed up the learning process. To date, different information sources have been used to give advice to the learner agent, among them human-sourced advice. When interacting with a learner agent, humans may provide either evaluative or informative advice. From the agent's perspective, these styles of interaction are commonly referred to as reward-shaping and policy-shaping, respectively. Evaluative advice requires the human to provide feedback on the action just performed, while informative advice requires the human to suggest the best action to select in a given situation. Prior research has focused on the effect of human-sourced advice on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent while reducing the engagement with the human. This work presents an experimental setup for a human trial designed to compare these advice-delivery methods in terms of human engagement. The results show that users giving informative advice to the learner agent provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach indicates that the agent's ability to follow the advice is higher, and therefore they judge their own advice to be of higher accuracy when compared to people providing evaluative advice. (33 pages, 15 figures)
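
    To make the reward-shaping/policy-shaping distinction concrete, the minimal Python sketch below (not from the paper; the agent structure, action names, and constants are illustrative assumptions) shows how each advice style enters a tabular Q-learning loop: evaluative advice is folded into the reward signal, while informative advice overrides action selection.

```python
import random
from collections import defaultdict

# Hypothetical tabular Q-learning agent illustrating the two advice styles.
Q = defaultdict(float)                    # Q[(state, action)] -> value estimate
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = ["left", "right", "up", "down"]

def select_action(state, advisor=None):
    """Policy-shaping: an informative advisor can bypass exploration
    by directly suggesting which action to take in this state."""
    if advisor is not None:
        suggestion = advisor(state)       # human-suggested action, or None
        if suggestion is not None:
            return suggestion
    if random.random() < EPSILON:         # otherwise epsilon-greedy
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, human_feedback=0.0):
    """Reward-shaping: an evaluative advisor rates the prior action;
    the rating is folded into the reward before the TD update."""
    shaped = reward + human_feedback      # e.g. +1 approval, -1 disapproval
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (shaped + GAMMA * best_next - Q[(state, action)])
```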

    Multi-modal feedback for affordance-driven interactive reinforcement learning

    Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task. In this paper, we present an IRL approach using dynamic audio-visual input in the form of vocal commands and hand gestures as feedback. Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues, along with a confidence value indicating the trustworthiness of the feedback. The integration process also considers the case in which the two modalities convey incongruent information. Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in the form of contextual affordances. We implement a neural network architecture to predict the effect of performed actions with different objects in order to avoid failed states, i.e., states from which it is not possible to accomplish the task. In our experimental setup, we explore the interplay of multi-modal feedback and task-specific affordances in a robot cleaning scenario. We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances. Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL. The results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
    In press: Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), IEEE World Congress on Computational Intelligence (WCCI), Rio de Janeiro, Brazil, July 2018. Sociedad Argentina de Informática e Investigación Operativa.
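
    As a rough illustration of the two mechanisms the abstract describes, here is a hedged Python sketch (the function names, the 0.5 confidence penalty for incongruent cues, and the string-based effect labels are assumptions, not the paper's implementation) of fusing vocal and gesture commands into a single advice signal with a trustworthiness value, and of blocking actions whose predicted effect is a failed state.

```python
def integrate_feedback(audio_cmd, visual_cmd, audio_conf, visual_conf):
    """Hypothetical fusion of a vocal command and a hand gesture into one
    advice signal plus a confidence value. When the modalities disagree
    (the incongruent case), keep the more confident cue but discount it."""
    if audio_cmd == visual_cmd:
        # Congruent cues reinforce each other.
        return audio_cmd, min(1.0, audio_conf + visual_conf)
    if audio_conf >= visual_conf:
        return audio_cmd, audio_conf * 0.5   # incongruent: penalise trust
    return visual_cmd, visual_conf * 0.5

def affordance_allows(state, action, predict_effect):
    """Contextual affordances: a learned model predicts the effect of an
    action on the current object; actions predicted to lead to a failed
    state (from which the task cannot be completed) are filtered out."""
    return predict_effect(state, action) != "failed_state"
```

    In this sketch the fused confidence would scale how strongly the human feedback influences the learning update, while the affordance filter prunes the action set before selection.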

    Enhancing reinforcement learning with a context-based approach

    Reinforcement Learning (RL) has shown outstanding capabilities in solving complex computational problems. However, most RL algorithms lack an explicit method for learning from contextual information. In reality, humans rely on context to identify patterns and relations among elements in the environment and to determine how to avoid incorrect actions. Conversely, what may seem like an obviously poor decision from a human perspective could take an agent hundreds of steps to learn to avoid. This thesis investigates methods for incorporating contextual information into RL in order to enhance learning performance.

    The research follows an incremental approach. First, contextual information is incorporated into RL in simulated environments, specifically in games. The experiments show that all the algorithms using contextual information significantly outperform the baseline algorithms, by 77% on average. The concept is then validated with a hybrid approach comprising a robot in a Human-Robot Interaction (HRI) scenario dealing with rigid objects: the robot learns in simulation while executing actions in the real world. In this setup, the proposed context-based algorithm (CQL) trains in 2.7 seconds and reaches an 84% success rate in a grasp-and-release task while interacting with a human user, whereas the baseline algorithm with the highest success rate reaches 68% after learning for a significantly longer period (91.8 seconds). CQL thus suits the robot's learning requirements: it observes the current scenario configuration and learns to solve it while dealing with dynamic changes provoked by the user.

    Additionally, the thesis explores an RL framework that uses contextual information to learn how to manipulate bags in the real world. A bag is a deformable object that presents challenges from grasping to planning, and RL has the potential to address this issue. The learning process is accomplished through a new RL algorithm introduced in this work, called Π-learning, designed to find the best grasping points of the bag based on a set of compact state representations. The framework utilises a set of primitive actions and represents the task in five states. In the experiments, the framework reaches 60% and 80% success rates after around three hours of training in the real world when starting the bagging task from folded and unfolded positions, respectively. Finally, the trained model is tested on two more bags of different sizes to evaluate its generalisation capacity.

    Overall, this research seeks to contribute to the broader advancement of RL and robotics, enhancing the development of intelligent, autonomous systems that can operate effectively in diverse and dynamic real-world settings, and to explore new possibilities for automation, HRI, and the use of contextual information in RL.
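
    As a hedged sketch of the context-based idea (the `Obj` structure, the `context_of` reduction, the action names, and the hyperparameters are illustrative assumptions; the thesis does not publish this code), a context-keyed Q-table lets the agent index values by a compact description of the scenario configuration rather than by raw state, so experience transfers across episodes that share the same configuration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    kind: str        # e.g. "cup", "bag"  (hypothetical labels)
    relation: str    # e.g. "on_table", "in_gripper"

def context_of(objects):
    """Reduce a scene to a hashable context: the sorted object relations."""
    return tuple(sorted((o.kind, o.relation) for o in objects))

Q = defaultdict(float)                    # Q[(context, action)] -> value estimate
ACTIONS = ["grasp", "release", "move"]    # primitive actions (assumed set)

def choose(objects):
    """Greedy action for the current scenario configuration."""
    ctx = context_of(objects)
    return max(ACTIONS, key=lambda a: Q[(ctx, a)])

def learn(objects, action, reward, next_objects, alpha=0.5, gamma=0.9):
    """Standard TD update, but keyed on contexts instead of raw states."""
    ctx, nxt = context_of(objects), context_of(next_objects)
    target = reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
    Q[(ctx, action)] += alpha * (target - Q[(ctx, action)])
```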