14 research outputs found
Enhancing reinforcement learning with a context-based approach
Reinforcement Learning (RL) has shown outstanding capabilities in solving complex
computational problems. However, most RL algorithms lack an explicit method
for learning from contextual information. In reality, humans rely on context to
identify patterns and relations among elements in the environment and determine
how to avoid taking incorrect actions. Conversely, what may seem like obviously
poor decisions from a human perspective could take an agent hundreds of steps
to learn to avoid. This thesis aims to investigate methods for incorporating
contextual information into RL in order to enhance learning performance.
The research follows an incremental approach in which, first, contextual information is incorporated into RL in simulated environments, specifically in games.
The experiments show that all the algorithms that use contextual information outperform the baseline algorithms by 77% on average. Then, the concept
is validated with a hybrid approach that comprises a robot in a Human-Robot Interaction (HRI) scenario dealing with rigid objects. The robot learns in simulation
while executing actions in the real world. For this setup, based on contextual information, the proposed algorithm trains in a reduced amount of time (2.7 seconds).
It reaches an 84% success rate in a grasp and release-related task while interacting with a human user, while the baseline algorithm with the highest success rate
reached 68% after learning during a significantly longer period of time (91.8 seconds). Consequently, the proposed algorithm, Contextual Q-learning (CQL), suits the robot’s learning requirements in observing the
current scenario configuration and learning to solve it while dealing with dynamic
changes provoked by the user.
Additionally, the thesis explores using an RL framework that uses contextual information to learn how to manipulate bags in the real world. A bag is a deformable
object that presents challenges from grasping to planning, and RL has the potential
to address this issue. The learning process is accomplished through a new RL algorithm introduced in this work called Π-learning, designed to find the best grasping
points of the bag based on a set of compact state representations. The framework
utilises a set of primitive actions and represents the task in five states. In the experiments, the framework reaches success rates of 60% and 80% after around three
hours of training in the real world when starting the bagging task from folded and
unfolded positions, respectively. Finally, the trained model is tested on two more
bags of different sizes to evaluate its generalisation capacities.
Overall, this research seeks to contribute to the broader advancement of RL and
robotics, aiming to enhance the development of intelligent, autonomous systems that
can effectively operate in diverse and dynamic real-world settings. Besides that, this
research seeks to explore new possibilities for automation, HRI, and the utilisation of contextual information in RL.
Context-sensitive personalities and behaviors for robots
This paper proposes Context-Sensitive Behaviors for Robots (CSBR), a method for generating diverse behaviors for robots in indoor environments based on five personality traits. This method is based on a novel model developed in this work that reacts to a synthetic genome that defines the personality of the robot. The model functions return different answers and reactions, depending on a given spoken request. The responses of the robot include spoken answers, facial animations, gestures, and actions. The novelty of this method lies in its capacity to adapt the behavior of the robot according to the context of the request. Moreover, the model is scalable since its functions not only return spoken answers but also physical responses, such as opening a gripper, saying hello with gestures, or animating a face that represents an emotion according to the context. Changes in the parameters of the synthetic genome produce different behaviors. By defining different synthetic genomes, robots can adapt to different people's moods. In this work, we introduce two scenarios for human–robot interaction in two domestic environments (house and office) through spoken requests from a human user. We implemented our method in Care-O-Bot 4 and defined three synthetic genomes to produce three behaviors: friendly, detached, and hostile. In the considered scenarios, we asked the robot the same set of requests for every synthetic genome. Care-O-Bot 4 answered according to its personality in each case, which shows that our method produces different behaviors. For these scenarios, we assume that the given request includes its connotation. Since our method has characteristics influenced by context, we show that the robot's behavior changes according to the human mood and the environment.
Learning to bag with a simulation-free reinforcement learning framework for robots
Bagging is an essential skill that humans perform in their daily activities.
However, deformable objects, such as bags, are complex for robots to
manipulate. This paper presents an efficient learning-based framework that
enables robots to learn bagging. The novelty of this framework is its ability
to perform bagging without relying on simulations. The learning process is
accomplished through a reinforcement learning algorithm introduced in this
work, designed to find the best grasping points of the bag based on a set of
compact state representations. The framework utilizes a set of primitive
actions and represents the task in five states. In our experiments, the
framework reaches success rates of 60% and 80% after around three hours of
training in the real world when starting the bagging task from folded and
unfolded positions, respectively. Finally, we test the trained model with two more bags
of different sizes to evaluate its generalizability.
Comment: IET Cyber-Systems and Robotics
Deep reinforcement learning with explicit context representation
Though reinforcement learning (RL) has shown an outstanding capability for solving complex computational problems, most RL algorithms lack an explicit method that would allow learning from contextual information. On the other hand, humans often use context to identify patterns and relations among elements in the environment, and to avoid taking wrong actions. However, what may seem like an obviously wrong decision from a human perspective could take hundreds of steps for an RL agent to learn to avoid. This paper proposes a framework for discrete environments called Iota explicit context representation (IECR). The framework involves representing each state using contextual key frames (CKFs), which can then be used to extract a function that represents the affordances of the state; in addition, two loss functions are introduced with respect to the affordances of the state. The novelty of the IECR framework lies in its capacity to extract contextual information from the environment and learn from the CKFs’ representation. We validate the framework by developing four new algorithms that learn using context: Iota deep Q-network (IDQN), Iota double deep Q-network (IDDQN), Iota dueling deep Q-network (IDuDQN), and Iota dueling double deep Q-network (IDDDQN). Furthermore, we evaluate the framework and the new algorithms in five discrete environments. We show that all the algorithms that use contextual information converge in around 40,000 training steps of the neural networks, significantly outperforming their state-of-the-art equivalents.
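For readers unfamiliar with the baselines that the four Iota algorithms build on, the sketch below shows how the standard DQN, double DQN, and dueling formulations differ. It covers only these well-known baselines; the abstract does not specify how CKFs or the two affordance-based loss functions are computed, so they are not modelled here. All function and parameter names are illustrative assumptions.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: bootstrap from the max of the target network's Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


# Hypothetical Q-value estimates for one next state with three actions.
next_q_online = [1.0, 3.0, 2.0]
next_q_target = [5.0, 0.5, 4.0]
print(dqn_target(1.0, next_q_target, gamma=0.5))
print(double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5))
print(dueling_q_values(2.0, [1.0, -1.0, 0.0]))
```

Decoupling action selection from action evaluation, as in the double variant, is the usual way to reduce the overestimation that the plain max introduces.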
AR and HRC integration for Enhanced Pragmatic Quality
In the landscape of modern manufacturing, Human-Robot Collaboration (HRC) has evolved to be an indispensable element in facilitating synchronized task execution between humans and their robotic counterparts. The infusion of augmented reality (AR) into HRC, particularly in AR-integrated assembly procedures, introduces a promising dimension to the assembly process. This research examines whether AR-enhanced assembly procedures can facilitate HRC. Central to our investigation are the operational implications and the potential enrichment of the operator's pragmatic quality. Our distinct methodological approach puts the spotlight on the holistic experience of operators in AR-integrated HRC scenarios. Our results underscore the AR assembly procedure's notable benefits in terms of increased effectiveness and elevated user satisfaction, reinforcing its value in HRC contexts.
Affordance-Based Human-Robot Interaction With Reinforcement Learning
Planning precise robotic manipulation to perform grasp and release-related operations while interacting with humans is a challenging problem. Reinforcement learning (RL) has the potential to make robots attain this capability. In this paper, we propose an affordance-based human-robot interaction (HRI) framework, aiming to reduce the size of the action space, which would otherwise considerably impede the exploration efficiency of the agent. The framework is based on a new algorithm called Contextual Q-learning (CQL). We first show that the proposed algorithm trains in a reduced amount of time (2.7 seconds) and reaches an 84% success rate. This efficiency suits the robot’s need to observe the current scenario configuration and learn to solve it. Then, we empirically validate the framework for implementation in HRI real-world scenarios. During the HRI, the robot uses semantic information from the state and the optimal policy of the last training step to search for relevant changes in the environment that may trigger the generation of a new policy.
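The idea of shrinking the action space with affordances can be illustrated with a minimal tabular Q-learning sketch. This is not the CQL algorithm itself, whose details the abstract does not give; the toy task, the `afforded_actions` function, the reward values, and the hyperparameters are all hypothetical and chosen only to show exploration and bootstrapping being restricted to afforded actions.

```python
import random
from collections import defaultdict

# Hypothetical toy task: an agent on a line of cells 0..4 must reach cell 4.
ACTIONS = ["left", "right", "grasp", "release"]  # full action set
GOAL = 4


def afforded_actions(state):
    """Hypothetical affordance function: only moves are useful in this toy task."""
    actions = []
    if state > 0:
        actions.append("left")
    if state < GOAL:
        actions.append("right")
    return actions


def step(state, action):
    """Toy dynamics: returns (next_state, reward, done)."""
    next_state = state + (1 if action == "right" else -1)
    done = next_state == GOAL
    return next_state, (1.0 if done else -0.01), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) -> estimated return
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            allowed = afforded_actions(state)  # pruned action set for this state
            if rng.random() < epsilon:
                action = rng.choice(allowed)
            else:
                action = max(allowed, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Bootstrap only over actions afforded in the next state.
            future = 0.0 if done else max(
                q[(next_state, a)] for a in afforded_actions(next_state)
            )
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best afforded action for every non-terminal state."""
    return {s: max(afforded_actions(s), key=lambda a: q[(s, a)]) for s in range(GOAL)}


q_table = train()
print(greedy_policy(q_table))
```

In this sketch the agent never evaluates `grasp` or `release`, so the Q-table holds at most two entries per state instead of four, which is the kind of saving an affordance filter is meant to provide.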