12,935 research outputs found

    Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks

    Get PDF
    Diversity of environments is a key challenge that causes learned robotic controllers to fail due to the discrepancies between the training and evaluation conditions. Training from demonstrations in various conditions can mitigate---but not completely prevent---such failures. Learned controllers such as neural networks typically do not have a notion of uncertainty that allows to diagnose an offset between training and testing conditions, and potentially intervene. In this work, we propose to use Bayesian Neural Networks, which have such a notion of uncertainty. We show that uncertainty can be leveraged to consistently detect situations in high-dimensional simulated and real robotic domains in which the performance of the learned controller would be sub-par. Also, we show that such an uncertainty based solution allows making an informed decision about when to invoke a fallback strategy. One fallback strategy is to request more data. We empirically show that providing data only when requested results in increased data-efficiency.Comment: Copyright 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work

    Model-Based Reinforcement Learning with Continuous States and Actions

    No full text
    Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment

    Gaussian-Process-based Robot Learning from Demonstration

    Full text link
    Endowed with higher levels of autonomy, robots are required to perform increasingly complex manipulation tasks. Learning from demonstration is arising as a promising paradigm for transferring skills to robots. It allows to implicitly learn task constraints from observing the motion executed by a human teacher, which can enable adaptive behavior. We present a novel Gaussian-Process-based learning from demonstration approach. This probabilistic representation allows to generalize over multiple demonstrations, and encode variability along the different phases of the task. In this paper, we address how Gaussian Processes can be used to effectively learn a policy from trajectories in task space. We also present a method to efficiently adapt the policy to fulfill new requirements, and to modulate the robot behavior as a function of task variability. This approach is illustrated through a real-world application using the TIAGo robot.Comment: 8 pages, 10 figure

    Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks

    Full text link
    A major challenge for the realization of intelligent robots is to supply them with cognitive abilities in order to allow ordinary users to program them easily and intuitively. One way of such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable to establish a common focus of attention and be able to use and integrate spoken instructions, visual perceptions, and non-verbal clues like gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS-robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and an modality fusion module to allow multi-modal task-oriented man-machine communication with respect to dextrous robot manipulation of objects.Comment: 7 pages, 8 figure

    Multi-Information Source Fusion and Optimization to Realize ICME: Application to Dual Phase Materials

    Get PDF
    Integrated Computational Materials Engineering (ICME) calls for the integration of computational tools into the materials and parts development cycle, while the Materials Genome Initiative (MGI) calls for the acceleration of the materials development cycle through the combination of experiments, simulation, and data. As they stand, both ICME and MGI do not prescribe how to achieve the necessary tool integration or how to efficiently exploit the computational tools, in combination with experiments, to accelerate the development of new materials and materials systems. This paper addresses the first issue by putting forward a framework for the fusion of information that exploits correlations among sources/models and between the sources and `ground truth'. The second issue is addressed through a multi-information source optimization framework that identifies, given current knowledge, the next best information source to query and where in the input space to query it via a novel value-gradient policy. The querying decision takes into account the ability to learn correlations between information sources, the resource cost of querying an information source, and what a query is expected to provide in terms of improvement over the current state. The framework is demonstrated on the optimization of a dual-phase steel to maximize its strength-normalized strain hardening rate. The ground truth is represented by a microstructure-based finite element model while three low fidelity information sources---i.e. reduced order models---based on different homogenization assumptions---isostrain, isostress and isowork---are used to efficiently and optimally query the materials design space.Comment: 19 pages, 11 figures, 5 table
    • …
    corecore