
    Test moment determination design in active robot learning

    A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, service robots have been increasingly used in people's daily lives. These robots are autonomous or semi-autonomous and are able to cooperate with their human users. Active robot learning (ARL) is an approach to developing a robot's beliefs about its users' intentions and preferences, which the robot needs in order to cooperate seamlessly with humans. This approach allows a robot to perform tests on its users and to build up high-order beliefs according to the users' responses. This study carried out primary research on designing the test moment determination component of the ARL framework. The test moment determination component decides the right moment to take a test action. In this study, an action plan theory was proposed to synthesise actions into a sequence, that is, an action plan, for a given task. All actions are defined in a special format of precondition, action, post-condition and testing time. Forward chaining reasoning was introduced to establish connections between the actions and to synthesise individual actions into an action plan corresponding to the given task. A simulation environment was set up in which a human user and a service robot were modelled using MATLAB. Fuzzy control was employed to control the robot as it carried out the cooperative action. In order to examine the effect of the test moment determination component, simulations were performed on a scenario in which a robot passes an object to a human user. The simulation results show that an action plan can be formed from the provided conditions and executed properly by the simulated models. Test actions were taken at the moments determined by the test moment determination component to discover the human user's intention.
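
    The abstract describes the action format (precondition, action, post-condition, testing time) and forward chaining without giving the algorithm. Below is a minimal sketch, assuming a simple fact-set representation, of how forward chaining could assemble an action plan for the pass-an-object scenario; the `Action` record, `plan_actions` function, and example facts are illustrative assumptions, not the thesis's implementation.

```python
from dataclasses import dataclass

@dataclass
class Action:
    """One action record: precondition, action name, post-condition, testing time."""
    name: str
    precondition: frozenset   # facts that must hold before the action
    postcondition: frozenset  # facts that hold after the action
    testing_time: float       # moment at which a test action may be taken

def plan_actions(initial: frozenset, goal: frozenset, actions: list) -> list:
    """Forward chaining: repeatedly apply any action whose precondition is
    satisfied by the current facts, until the goal facts are reached."""
    facts, plan = set(initial), []
    while not goal <= facts:
        applicable = [a for a in actions
                      if a.precondition <= facts and not a.postcondition <= facts]
        if not applicable:
            raise ValueError("no action plan reaches the goal")
        step = applicable[0]          # naive choice; a real planner would search
        facts |= step.postcondition
        plan.append(step)
    return plan

# Hypothetical pass-an-object scenario loosely matching the abstract
actions = [
    Action("approach_user",  frozenset({"at_start"}),  frozenset({"near_user"}),      1.0),
    Action("extend_arm",     frozenset({"near_user"}), frozenset({"arm_out"}),        2.0),
    Action("release_object", frozenset({"arm_out"}),   frozenset({"object_passed"}),  3.0),
]
plan = plan_actions(frozenset({"at_start"}), frozenset({"object_passed"}), actions)
print([a.name for a in plan])  # ['approach_user', 'extend_arm', 'release_object']
```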

    DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

    We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated as the agent moves in an environment, and it is not trivial to understand due to the number of dimensions, dependencies on past vectors, spatial/temporal correlations, and co-correlations between dimensions. It is often referred to as a black box, as only the inputs (images) and outputs (actions) are intelligible to humans. DRLViz assists experts in interpreting decisions through memory reduction interactions, and in investigating the role of parts of the memory when errors have been made (e.g. a wrong direction). We report on DRLViz applied in the context of video game simulators (ViZDoom) for a navigation scenario with item-gathering tasks. We also report on experts' evaluation of DRLViz, its applicability to other scenarios and navigation problems beyond simulation games, and its contribution to black-box model interpretability and explainability in the field of visual analytics.
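
    DRLViz itself is an interactive interface rather than a library; as a rough sketch of the kind of memory reduction it describes, the snippet below projects an episode's recurrent hidden-state vectors to 2D and ranks memory dimensions by variance. The array shapes and the random stand-in data are assumptions; a real analysis would use the hidden states recorded from the trained agent.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the agent's recurrent memory: one ~512-d hidden vector per
# timestep of an episode (DRLViz reads such vectors from the trained agent).
T, H = 300, 512
memory = np.random.randn(T, H)  # replace with the real hidden states h_t

# Reduce the high-dimensional memory to 2D so temporal structure becomes
# visible, analogous to the "memory reduction" views DRLViz offers.
trajectory_2d = PCA(n_components=2).fit_transform(memory)

# Rank memory dimensions by variance over the episode: a simple proxy for
# asking which parts of the memory are most active in decisions.
active_dims = np.argsort(memory.var(axis=0))[::-1][:10]
print("top-variance memory dimensions:", active_dims)
```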

    Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning

    Our work aims at efficiently leveraging ambiguous demonstrations for the training of a reinforcement learning (RL) agent. An ambiguous demonstration can usually be interpreted in multiple ways, which severely hinders the RL agent from learning stably and efficiently. Since an optimal demonstration may also be ambiguous, previous works that combine RL with learning from demonstration (RLfD) may not work well. Inspired by how humans handle such situations, we propose to use self-explanation (an agent generating explanations for itself) to recognize valuable high-level relational features as an interpretation of why a successful trajectory is successful. In this way, the agent can provide guidance for its own RL training. Our main contribution is the Self-Explanation for RL from Demonstrations (SERLfD) framework, which can overcome the limitations of traditional RLfD works. Our experimental results show that an RLfD model can be improved by the SERLfD framework in terms of training stability and performance.
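
    The paper defines SERLfD in full; one way to picture the idea is that self-explanations act as a learned shaping term over high-level relational predicates. The sketch below illustrates that shaping pattern with invented names (`shaped_reward`, the predicate utilities); it is not the authors' code.

```python
def shaped_reward(env_reward, predicates, utility):
    """Add a self-explanation bonus: high-level relational predicates that
    the explainer has learned to associate with success contribute extra
    reward. `predicates` maps predicate names to 0/1 truth values in the
    current state; `utility` holds the explainer's learned weights."""
    bonus = sum(utility.get(p, 0.0) * v for p, v in predicates.items())
    return env_reward + bonus

# Hypothetical example: the explainer has learned that "holding_tool" tends
# to distinguish successful trajectories from ambiguous ones.
utility = {"holding_tool": 0.5, "near_goal": 0.2}
predicates = {"holding_tool": 1, "near_goal": 0}
print(shaped_reward(0.0, predicates, utility))  # 0.5
```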

    Reinforcement Learning Approaches in Social Robotics

    This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts with its environment through trial and error to discover an optimal behavior. Since interaction is a key component of both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper focuses particularly on studies that include physical social robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to the survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization comprises three major themes: interactive reinforcement learning, intrinsically motivated methods, and task-performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the surveyed papers with respect to whether they use subjective or algorithmic measures, a discussion of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including approaches that have thus far received less attention. Thus, this paper aims to serve as a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.
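
    To make the survey's "interactive reinforcement learning" category concrete, here is a minimal sketch, under assumed names and weights, of a tabular Q-learning update in which a human feedback signal is blended into the reward; none of this comes from the surveyed papers themselves.

```python
import random
from collections import defaultdict

Q = defaultdict(float)          # Q[(state, action)] value table
alpha, gamma, epsilon = 0.1, 0.95, 0.1
beta = 0.5                      # assumed weight on the human's feedback

def select_action(state, actions):
    """Epsilon-greedy: explore randomly, otherwise act greedily on Q."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, env_reward, human_feedback, next_state, actions):
    """Interactive RL: the learning signal blends task reward with human
    feedback (e.g. a smile/frown mapped to +1/-1 by the robot's perception)."""
    r = env_reward + beta * human_feedback
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])

# e.g. after one interaction step:
# update("greeting", "wave", env_reward=0.0, human_feedback=1.0,
#        next_state="engaged", actions=["wave", "speak"])
```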

    Framework of active robot learning

    A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, cognitive robots have become an attractive research area of Artificial Intelligence (AI). High-order beliefs of cognitive robots concern the robots' thoughts about their users' intentions and preferences. The existing approaches to developing such beliefs through machine learning rely on particular social cues or specifically defined reward functions, so their applications can be limited. This study carried out primary research on active robot learning (ARL), which enables a robot to develop high-order beliefs by actively collecting or discovering the evidence it needs. The emphasis is on active learning rather than teaching; hence, social cues and reward functions are not necessary. In this study, the framework of ARL was developed. Fuzzy logic was employed in the framework for controlling the robot and for identifying high-order beliefs. A simulation environment was set up in which a human and a cognitive robot were modelled using MATLAB, and ARL was implemented through simulation. Simulations were also performed in which the human and the robot tried to jointly lift a stick and keep it level. The simulation results show that, under the framework, a robot is able to discover the evidence it needs to confirm its user's intention.
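
    The thesis uses fuzzy logic both for control and for belief identification; the following sketch shows a tiny fuzzy controller of the kind that might keep the jointly lifted stick level, with triangular membership functions and centroid-style defuzzification. The rule base, tilt ranges, and output speeds are invented for illustration, not taken from the thesis.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_lift_control(tilt):
    """Map stick tilt (radians, positive means the robot's end is too low)
    to a lift speed. Three rules: negative tilt -> lower, zero -> hold,
    positive -> raise."""
    mu = {
        "neg":  tri(tilt, -0.6, -0.3, 0.0),
        "zero": tri(tilt, -0.3,  0.0, 0.3),
        "pos":  tri(tilt,  0.0,  0.3, 0.6),
    }
    outputs = {"neg": -1.0, "zero": 0.0, "pos": 1.0}  # rule consequents
    num = sum(mu[k] * outputs[k] for k in mu)
    den = sum(mu.values()) or 1.0
    return num / den  # weighted-average (centroid-style) defuzzification

print(fuzzy_lift_control(0.15))  # ~0.5: raise the robot's end moderately
```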

    Exploration of genetic network programming with two-stage reinforcement learning for mobile robot

    This paper examines the exploration behaviour of Genetic Network Programming with Two-Stage Reinforcement Learning for mobile robot navigation. The proposed method aims to observe exploration when previously unseen environments are used in the implementation. To deal with this situation, individuals are first trained in a training phase; that is, they learn the environment with an ϵ-greedy policy and a learning rate α. Two cases are studied: case A for low exploration and case B for high exploration. In the implementation phase, the individuals apply their experience and learn a new environment online. The performance of the learning processes is then observed under the environmental changes.
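
    Since the two cases differ in their exploration settings, the snippet below sketches the training-phase ϵ-greedy choice with assumed parameter values for case A (low exploration) and case B (high exploration); the concrete numbers are placeholders, not the paper's.

```python
import random

# Assumed parameterisations for the two studied cases
CASES = {
    "A": {"epsilon": 0.05, "alpha": 0.1},  # low exploration
    "B": {"epsilon": 0.30, "alpha": 0.1},  # high exploration
}

def epsilon_greedy(q_values, epsilon):
    """Training-phase policy: with probability epsilon pick a random action
    (explore), otherwise pick the current best action (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

q = [0.2, 0.8, 0.5]  # toy action values at one node of the network
for case, p in CASES.items():
    picks = [epsilon_greedy(q, p["epsilon"]) for _ in range(1000)]
    print(case, "greedy fraction:", picks.count(1) / 1000)
```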

    Human Machine Interaction

    In this book, the reader will find a set of papers divided into two sections. The first section presents different proposals focused on the human-machine interaction development process. The second section is devoted to different aspects of interaction, with a special emphasis on physical interaction.