Test moment determination design in active robot learning
A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, service robots have been increasingly used in people's daily lives.
These robots are autonomous or semi-autonomous and are able to cooperate with their human users. Active robot learning (ARL) is an approach to developing robots' beliefs about their users' intentions and preferences, which the robots need in order to cooperate seamlessly with humans. This approach allows a robot to perform tests on its users and to build up high-order beliefs according to the users' responses.
This study carried out primary research on designing the test moment determination component in the ARL framework. The test moment determination component is used to decide the right moment for taking a test action. In this study, an action plan theory was suggested to synthesise actions into a sequence, that is, an action plan, for a given task.
All actions are defined in a special format of precondition, action, post-condition and testing time. Forward chaining reasoning was introduced to establish connections between the actions and to synthesise individual actions into an action plan corresponding to the given task. A simulation environment was set up where a human user and a service robot were modelled using MATLAB. Fuzzy control was employed to control the robot while it carried out the cooperative action.
In order to examine the effect of the test moment determination component, simulations were performed to execute a scenario in which a robot passes an object to a human user. The simulation results show that an action plan can be formed according to the provided conditions and executed properly by the simulated models. Test actions were taken at the moments determined by the test moment determination component to find the human user's intention.
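The precondition/action/post-condition/testing-time format and the forward-chaining step described in the abstract above can be sketched in a few lines. The specific predicates, action names, and the object-passing task below are illustrative assumptions, not details from the thesis.

```python
# Minimal sketch of forward-chaining action plan synthesis. Each action is
# stored as (precondition set, action name, post-condition set, testing time);
# the predicates and the example task are hypothetical.

ACTIONS = [
    ({"robot_idle"}, "approach_user", {"near_user"}, 1.0),
    ({"near_user"}, "grasp_object", {"holding_object"}, 2.0),
    ({"near_user", "holding_object"}, "pass_object", {"object_passed"}, 1.5),
]

def synthesise_plan(initial_state, goal):
    """Chain actions forward until the goal condition holds."""
    state, plan = set(initial_state), []
    while goal not in state:
        for pre, name, post, test_time in ACTIONS:
            if pre <= state and not post <= state:
                plan.append((name, test_time))   # a test action may follow here
                state |= post
                break
        else:
            return None  # no applicable action: the plan cannot be completed
    return plan

print(synthesise_plan({"robot_idle"}, "object_passed"))
# [('approach_user', 1.0), ('grasp_object', 2.0), ('pass_object', 1.5)]
```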
DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning
We present DRLViz, a visual analytics interface to interpret the internal
memory of an agent (e.g. a robot) trained using deep reinforcement learning.
This memory is composed of large temporal vectors updated when the agent moves
in an environment and is not trivial to understand due to the number of
dimensions, dependencies on past vectors, spatial/temporal correlations, and
co-correlations between dimensions. It is often referred to as a black box, as
only inputs (images) and outputs (actions) are intelligible to humans. Using
DRLViz, experts can interpret decisions through memory reduction
interactions and investigate the role of parts of the memory when errors
are made (e.g. a wrong direction). We report on DRLViz applied in the
context of a video game simulator (ViZDoom) for a navigation scenario with
item-gathering tasks. We also report on an expert evaluation of DRLViz, on the
applicability of DRLViz to other scenarios and navigation problems beyond
simulation games, and on its contribution to the interpretability and
explainability of black-box models in the field of visual analytics.
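As a rough illustration of the kind of memory reduction such an interface relies on, the sketch below projects a trajectory of recurrent hidden-state vectors to two dimensions for inspection over time. The 512-dimensional memory, the use of PCA, and the random stand-in data are assumptions; DRLViz's actual pipeline may differ.

```python
# Illustrative sketch: reduce a trajectory of agent memory vectors to 2-D so
# their evolution over time can be inspected. Dimensions and PCA are assumed.
import numpy as np
from sklearn.decomposition import PCA

T, D = 300, 512                        # timesteps, memory dimensions (assumed)
memory = np.random.randn(T, D)         # stand-in for recorded hidden states

embedding = PCA(n_components=2).fit_transform(memory)   # shape (T, 2)

# Each row of `embedding` is one timestep; plotting the rows in order shows
# how the agent's memory drifts as it navigates the environment.
for t in (0, T // 2, T - 1):
    print(t, embedding[t])
```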
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning
Our work aims at efficiently leveraging ambiguous demonstrations for the
training of a reinforcement learning (RL) agent. An ambiguous demonstration can
usually be interpreted in multiple ways, which severely hinders the RL agent
from learning stably and efficiently. Since an optimal demonstration may also
suffer from being ambiguous, previous works that combine RL and learning from
demonstration (RLfD works) may not work well. Inspired by how humans handle
such situations, we propose to use self-explanation (an agent generates
explanations for itself) to recognize valuable high-level relational features
as an interpretation of why a successful trajectory is successful. This way,
the agent can provide some guidance for its RL learning. Our main contribution
is to propose the Self-Explanation for RL from Demonstrations (SERLfD)
framework, which can overcome the limitations of traditional RLfD works. Our
experimental results show that an RLfD model can be improved by using our
SERLfD framework in terms of training stability and performance
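One common way to turn such self-explanations into guidance is reward shaping: the agent receives a bonus when the high-level relational features its explanation marked as useful are satisfied. The sketch below shows that idea only; the feature predicates, utility weights, and shaping form are assumptions rather than the actual SERLfD algorithm.

```python
# Hedged sketch: shaping an environment reward with a self-explanation signal.
# `utility` stands in for learned per-feature usefulness; the real SERLfD
# framework may combine explanations and rewards differently.

def shaped_reward(env_reward, active_features, utility, beta=0.1):
    """Add a bonus proportional to how many 'useful' relational features
    (according to the self-explanation) hold in the current state."""
    explanation_score = sum(utility.get(f, 0.0) for f in active_features)
    return env_reward + beta * explanation_score

# Example: the explanation learned that being near the goal object matters.
utility = {"near(goal_object)": 1.0, "holding(key)": 0.5}
print(shaped_reward(0.0, {"near(goal_object)"}, utility))   # 0.1
```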
Research on robot navigation without prior knowledge in dynamic environments using deep reinforcement learning
Tohoku University, Doctor of Engineering thesis
Reinforcement Learning Approaches in Social Robotics
This article surveys reinforcement learning approaches in social robotics.
Reinforcement learning is a framework for decision-making problems in which an
agent interacts through trial-and-error with its environment to discover an
optimal behavior. Since interaction is a key component in both reinforcement
learning and social robotics, reinforcement learning can be well suited to real-world
interactions with physically embodied social robots. The scope of the paper is
focused particularly on studies that include social physical robots and
real-world human-robot interactions with users. We present a thorough analysis
of reinforcement learning approaches in social robotics. In addition to a
survey, we categorize existing reinforcement learning approaches based on the
method used and the design of the reward mechanisms. Moreover, since
communication capability is a prominent feature of social robots, we discuss
and group the papers based on the communication medium used for reward
formulation. Considering the importance of designing the reward function, we
also provide a categorization of the papers based on the nature of the reward.
This categorization includes three major themes: interactive reinforcement
learning, intrinsically motivated methods, and task performance-driven methods.
The paper also covers the benefits and challenges of reinforcement learning in
social robotics, the evaluation methods of the surveyed papers (whether they
use subjective or algorithmic measures), a discussion of real-world
reinforcement learning challenges and proposed solutions, and the points that
remain to be explored, including approaches that have thus far received less
attention. Thus, this paper aims to become a starting point
for researchers interested in using and applying reinforcement learning methods
in this particular research field
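To make the reward-design categories concrete, the sketch below mixes a task-performance term with a human feedback term, in the spirit of interactive reinforcement learning, and feeds it into a tabular Q-learning update. The weights, state and action names, and feedback encoding are generic assumptions, not a method from any single surveyed paper.

```python
# Generic illustration of an interactive-RL reward: a weighted mix of task
# performance and human feedback, used in a tabular Q-learning update.
from collections import defaultdict

Q = defaultdict(float)                  # Q[(state, action)] -> value
alpha, gamma, w_human = 0.1, 0.95, 0.5  # assumed hyperparameters

def combined_reward(task_reward, human_feedback):
    """human_feedback in {-1, 0, +1}, e.g. from a button press or a smile."""
    return task_reward + w_human * human_feedback

def q_update(state, action, task_reward, human_feedback, next_state, actions):
    r = combined_reward(task_reward, human_feedback)
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])

# Example step: the greeting succeeded (task reward 1) and the user smiled (+1).
q_update("idle", "greet", 1.0, +1, "engaged", ["greet", "wait"])
print(Q[("idle", "greet")])             # 0.15
```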
Framework of active robot learning
A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, cognitive robots have become an attractive research area of Artificial Intelligence (AI). High-order beliefs of cognitive robots concern the robots' thoughts about their users' intentions and preferences. Existing approaches to developing such beliefs through machine learning rely on particular social cues or specifically defined reward functions. Therefore, their applications can be limited.
This study carried out primary research on active robot learning (ARL), which enables a robot to develop high-order beliefs by actively collecting and discovering the evidence it needs. The emphasis is on active learning rather than teaching; hence, social cues and reward functions are not necessary. In this study, the framework of ARL was developed. Fuzzy logic was employed in the framework for controlling the robot and for identifying high-order beliefs. A simulation environment was set up in which a human and a cognitive robot were modelled using MATLAB, and ARL was implemented through simulation.
Simulations were also performed in this study in which the human and the robot tried to jointly lift a stick and keep it level. The simulation results show that, under the framework, a robot is able to discover the evidence it needs to confirm its user's intention.
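Both theses above use fuzzy logic to control the robot; as an illustration of the idea only, the sketch below maps the stick's tilt error to a corrective lift speed using triangular membership functions and a weighted-average defuzzification. The membership breakpoints, rule outputs, and units are assumptions; the theses implement their own rule bases in MATLAB.

```python
# Hedged sketch of a one-input fuzzy controller for keeping a jointly lifted
# stick level. Breakpoints, outputs, and units are illustrative assumptions.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def lift_speed(tilt_error_deg):
    """Map tilt error (robot end too low = positive) to a lift speed in cm/s."""
    rules = [
        (tri(tilt_error_deg, -20.0, -10.0, 0.0), -2.0),  # robot end too high -> lower
        (tri(tilt_error_deg, -5.0, 0.0, 5.0), 0.0),      # roughly level -> hold
        (tri(tilt_error_deg, 0.0, 10.0, 20.0), 2.0),     # robot end too low -> raise
    ]
    num = sum(weight * output for weight, output in rules)
    den = sum(weight for weight, _ in rules)
    return num / den if den > 0 else 0.0

print(lift_speed(3.0))   # ~0.86 cm/s: a gentle corrective raise
```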
Exploration of genetic network programming with two-stage reinforcement learning for mobile robot
This paper examines the exploration behaviour of Genetic Network Programming with Two-Stage Reinforcement Learning for mobile robot navigation. The proposed method aims to observe its exploration when previously unseen environments are used in the implementation. To deal with this situation, individuals are first trained in a training phase, that is, they learn the environment with an ϵ-greedy policy and a learning rate α. Here, two cases are studied, i.e., case A for low exploration and case B for high exploration. In the implementation, the individuals are deployed to gain experience and learn a new environment online. Then, the performance of the learning processes is observed under the environmental changes.
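The abstract refers to an ϵ-greedy policy with learning rate α and contrasts a low-exploration setting (case A) with a high-exploration one (case B). The sketch below illustrates only the role of those two parameters with a plain tabular Q-learning agent; the Genetic Network Programming individuals themselves are not modelled, and all names and values are assumptions.

```python
# Generic sketch of epsilon-greedy action selection and a Q-learning update,
# showing the roles of epsilon (exploration) and alpha (learning rate).
import random
from collections import defaultdict

Q = defaultdict(float)                      # Q[(state, action)] -> value

def epsilon_greedy(state, actions, epsilon):
    """Low epsilon (case A) mostly exploits; high epsilon (case B) explores more."""
    if random.random() < epsilon:
        return random.choice(actions)                      # explore
    return max(actions, key=lambda a: Q[(state, a)])       # exploit

def q_update(state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

actions = ["forward", "turn_left", "turn_right"]
a = epsilon_greedy("corridor", actions, epsilon=0.3)       # case B: more exploration
q_update("corridor", a, reward=1.0, next_state="junction", actions=actions)
```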
Human Machine Interaction
In this book, the reader will find a set of papers divided into two sections. The first section presents different proposals focused on the human-machine interaction development process. The second section is devoted to different aspects of interaction, with a special emphasis on physical interaction.