Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning
Reinforcement learning is an approach used by intelligent agents to
autonomously learn new skills. Although reinforcement learning has been
demonstrated to be an effective learning approach in several different
contexts, a common drawback exhibited is the time needed in order to
satisfactorily learn a task, especially in large state-action spaces. To
address this issue, interactive reinforcement learning proposes the use of
externally-sourced information in order to speed up the learning process. Up to
now, different information sources have been used to give advice to the learner
agent, among them human-sourced advice. When interacting with a learner agent,
humans may provide either evaluative or informative advice. From the agent's
perspective these styles of interaction are commonly referred to as
reward-shaping and policy-shaping, respectively. Evaluative advice requires the
human to provide feedback on the action just performed, while informative advice
requires the human to suggest the best action to take in a given situation. Prior
research has focused on the effect of human-sourced advice on the interactive
reinforcement learning process, specifically aiming to improve the learning
speed of the agent, while reducing the engagement with the human. This work
presents an experimental setup for a human-trial designed to compare the
methods people use to deliver advice in terms of human engagement. Obtained
results show that users giving informative advice to the learner agents provide
more accurate advice, are willing to assist the learner agent for a longer
time, and provide more advice per episode. Additionally, self-evaluation from
participants using the informative approach has indicated that the agent's
ability to follow the advice is higher, and therefore, they feel their own
advice to be of higher accuracy when compared to people providing evaluative
advice.
Comment: 33 pages, 15 figures
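The evaluative/informative distinction above can be sketched in code. The fragment below is an illustrative tabular Q-learning snippet, not the paper's implementation; all names and constants are hypothetical. Evaluative advice enters as an extra reward term (reward shaping), while informative advice directly overrides action selection (policy shaping).

```python
import random

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (assumed values)

def q_update(q, state, action, reward, next_state, human_feedback=0.0):
    """Evaluative advice (reward shaping): the human's feedback signal is
    added to the environment reward before the standard Q-learning update."""
    shaped = reward + human_feedback
    best_next = max(q[next_state].values())
    q[state][action] += ALPHA * (shaped + GAMMA * best_next - q[state][action])

def select_action(q, state, advised_action=None, epsilon=0.1):
    """Informative advice (policy shaping): if the human suggests an action
    for this state, the agent follows it; otherwise it acts epsilon-greedily."""
    if advised_action is not None:
        return advised_action
    if random.random() < epsilon:
        return random.choice(list(q[state]))
    return max(q[state], key=q[state].get)
```

In this sketch, positive human feedback accelerates value propagation toward advised behaviour, while an advised action removes exploration entirely for that step, which matches the intuition that informative advice gives the agent more direct guidance per interaction.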
Multi-modal feedback for affordance-driven interactive reinforcement learning
Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task. In this paper, we present an IRL approach using dynamic audio-visual input in terms of vocal commands and hand gestures as feedback. Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues along with a confidence value indicating the trustworthiness of the feedback. The integration process also considers the case in which the two modalities convey incongruent information. Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in terms of contextual affordances. We implement a neural network architecture to predict the effect of performed actions with different objects to avoid failed states, i.e., states from which it is not possible to accomplish the task. In our experimental setup, we explore the interplay of multi-modal feedback and task-specific affordances in a robot cleaning scenario. We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances. Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL. The obtained results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
In press. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), IEEE World Congress on Computational Intelligence (WCCI), Rio de Janeiro, Brazil, July 2018. Sociedad Argentina de Informática e Investigación Operativa
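One way to picture the multi-modal integration step described above is as a small function that fuses two (command, confidence) cues. This is a hedged sketch under assumed semantics, not the paper's architecture: congruent cues reinforce each other, while incongruent cues fall back to the more confident modality with a penalty for the disagreement.

```python
# Hypothetical multi-modal feedback fusion: each modality yields a
# (command, confidence) pair with confidence in [0, 1].

def integrate_feedback(audio, vision):
    (a_cmd, a_conf), (v_cmd, v_conf) = audio, vision
    if a_cmd == v_cmd:
        # Congruent cues: keep the command; combined confidence is the
        # probability that at least one cue is reliable.
        return a_cmd, 1.0 - (1.0 - a_conf) * (1.0 - v_conf)
    # Incongruent cues: trust the more confident modality, scaled down
    # by the confidence of the contradicting cue.
    if a_conf >= v_conf:
        return a_cmd, a_conf * (1.0 - v_conf)
    return v_cmd, v_conf * (1.0 - a_conf)
```

A downstream IRL loop could then threshold the returned confidence to decide whether the fused command is trustworthy enough to influence learning at all.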
Enhancing reinforcement learning with a context-based approach
Reinforcement Learning (RL) has shown outstanding capabilities in solving complex
computational problems. However, most RL algorithms lack an explicit method
for learning from contextual information. In reality, humans rely on context to
identify patterns and relations among elements in the environment and determine
how to avoid making incorrect actions. Conversely, what may seem like obvious
poor decisions from a human perspective could take hundreds of steps for an agent
to learn how to avoid them. This thesis aims to investigate methods for incorporating
contextual information into RL in order to enhance learning performance.
The research follows an incremental approach in which, first, contextual information is incorporated into RL in simulated environments, more concretely in games.
The experiments show that all the algorithms that use contextual information significantly outperform the baseline algorithms by 77% on average. Then, the concept
is validated with a hybrid approach that comprises a robot in a Human-Robot Interaction (HRI) scenario dealing with rigid objects. The robot learns in simulation
while executing actions in the real world. For this setup, based on contextual information, the proposed algorithm trains in a reduced amount of time (2.7 seconds).
It reaches an 84% success rate in a grasp-and-release task while interacting with
a human user, whereas the baseline algorithm with the highest success rate
reached 68% after learning for a significantly longer period (91.8 seconds).
Consequently, CQL suits the robot's learning requirements: it observes the
current scenario configuration and learns to solve it while dealing with dynamic
changes provoked by the user.
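The idea of conditioning learning on context, as in the HRI setup above, can be sketched as a value table indexed by (context, state, action). This is a minimal illustrative sketch of a context-conditioned Q-learning agent; the thesis's actual CQL algorithm may differ, and all names here are assumptions.

```python
from collections import defaultdict

class ContextQ:
    """Tabular Q-learning whose values are keyed by context as well as
    state, so the same state can map to different behaviour in different
    scenario configurations."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)  # keyed by (context, state, action)
        self.actions, self.alpha, self.gamma = actions, alpha, gamma

    def best(self, context, state):
        # Greedy action for the current scenario configuration.
        return max(self.actions, key=lambda a: self.q[(context, state, a)])

    def update(self, context, state, action, reward, next_state):
        # Standard Q-learning target, evaluated within the same context.
        target = reward + self.gamma * max(
            self.q[(context, next_state, a)] for a in self.actions)
        key = (context, state, action)
        self.q[key] += self.alpha * (target - self.q[key])
```

Keying values by context keeps the table small per scenario while letting the agent adapt quickly when the user changes the configuration, which is consistent with the short training times reported above.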
Additionally, the thesis explores using an RL framework that uses contextual information to learn how to manipulate bags in the real world. A bag is a deformable
object that presents challenges from grasping to planning, and RL has the potential
to address this issue. The learning process is accomplished through a new RL algorithm introduced in this work, called Π-learning, designed to find the best grasping
points of the bag based on a set of compact state representations. The framework
utilises a set of primitive actions and represents the task in five states. In the experiments, the framework reaches a 60% and 80% success rate after around three
hours of training in the real world when starting the bagging task from folded and
unfolded positions, respectively. Finally, the trained model is tested on two more
bags of different sizes to evaluate its generalisation capacities.
Overall, this research seeks to contribute to the broader advancement of RL and
robotics, aiming to enhance the development of intelligent, autonomous systems that
can effectively operate in diverse and dynamic real-world settings. Besides that, this
research seeks to explore new possibilities for automation, HRI, and the utilisation of contextual information in RL.