366 research outputs found

    Safety-Aware Robot Damage Recovery Using Constrained Bayesian Optimization and Simulated Priors

    Get PDF
    International audienceThe recently introduced Intelligent Trial-and-Error (IT&E) algorithm showed that robots can adapt to damage in a matter of a few trials. The success of this algorithm relies on two components: prior knowledge acquired through simulation with an intact robot, and Bayesian optimization (BO) that operates on-line, on the damaged robot. While IT&E leads to fast damage recovery, it does not incorporate any safety constraints that prevent the robot from attempting harmful behaviors. In this work, we address this limitation by replacing the BO component with a constrained BO procedure. We evaluate our approach on a simulated damaged humanoid robot that needs to crawl as fast as possible, while performing as few unsafe trials as possible. We compare our new " safety-aware IT&E " algorithm to IT&E and a multi-objective version of IT&E in which the safety constraints are dealt as separate objectives. Our results show that our algorithm outperforms the other approaches, both in crawling speed within the safe regions and number of unsafe trials

    Bayesian Optimization with Automatic Prior Selection for Data-Efficient Direct Policy Search

    Get PDF
    One of the most interesting features of Bayesian optimization for direct policy search is that it can leverage priors (e.g., from simulation or from previous tasks) to accelerate learning on a robot. In this paper, we are interested in situations for which several priors exist but we do not know in advance which one fits best the current situation. We tackle this problem by introducing a novel acquisition function, called Most Likely Expected Improvement (MLEI), that combines the likelihood of the priors and the expected improvement. We evaluate this new acquisition function on a transfer learning task for a 5-DOF planar arm and on a possibly damaged, 6-legged robot that has to learn to walk on flat ground and on stairs, with priors corresponding to different stairs and different kinds of damages. Our results show that MLEI effectively identifies and exploits the priors, even when there is no obvious match between the current situations and the priors.Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 1 algorithm; Video at https://youtu.be/xo8mUIZTvNE ; Spotlight ICRA presentation https://youtu.be/iiVaV-U6Kq

    A survey on policy search algorithms for learning robot controllers in a handful of trials

    Get PDF
    Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word "big-data", we refer to this challenge as "micro-data reinforcement learning". We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and optimizing the computing time.Comment: 21 pages, 3 figures, 4 algorithms, accepted at IEEE Transactions on Robotic

    A survey on policy search algorithms for learning robot controllers in a handful of trials

    Get PDF
    International audienceMost policy search (PS) algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word “big-data,” we refer to this challenge as “micro-data reinforcement learning.” In this article, we show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based PS), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots, designing generic priors, and optimizing the computing time

    20 Years of Reality Gap: a few Thoughts about Simulators in Evolutionary Robotics

    Get PDF
    International audienceSimulators in Evolutionary Robotics (ER) are often considered as a "temporary evil" until experiments can be conducted on real robots. Yet, aaer more than 20 years of ER, most experiments still happen in simulation and nothing suggests that this situation will change in the next few years. In this short paper, we describe the requirements of ER from simulators, what we tried, and how we successfully crossed the "reality gap" in many experiments. We argue that future simulators need to be able to estimate their conndence when they predict a tness value, so that behaviors that are not accurately simulated can be avoided

    Physics-Informed Computer Vision: A Review and Perspectives

    Full text link
    Incorporation of physical information in machine learning frameworks are opening and transforming many application domains. Here the learning process is augmented through the induction of fundamental knowledge and governing physical laws. In this work we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of formulation and approaches to computer vision tasks guided by physical laws. We begin by decomposing the popular computer vision pipeline into a taxonomy of stages and investigate approaches to incorporate governing physical equations in each stage. Existing approaches in each task are analyzed with regard to what governing physical processes are modeled, formulated and how they are incorporated, i.e. modify data (observation bias), modify networks (inductive bias), and modify losses (learning bias). The taxonomy offers a unified view of the application of the physics-informed capability, highlighting where physics-informed learning has been conducted and where the gaps and opportunities are. Finally, we highlight open problems and challenges to inform future research. While still in its early days, the study of physics-informed computer vision has the promise to develop better computer vision models that can improve physical plausibility, accuracy, data efficiency and generalization in increasingly realistic applications

    Safe Robot Planning and Control Using Uncertainty-Aware Deep Learning

    Get PDF
    In order for robots to autonomously operate in novel environments over extended periods of time, they must learn and adapt to changes in the dynamics of their motion and the environment. Neural networks have been shown to be a versatile and powerful tool for learning dynamics and semantic information. However, there is reluctance to deploy these methods on safety-critical or high-risk applications, since neural networks tend to be black-box function approximators. Therefore, there is a need for investigation into how these machine learning methods can be safely leveraged for learning-based controls, planning, and traversability. The aim of this thesis is to explore methods for both establishing safety guarantees as well as accurately quantifying risks when using deep neural networks for robot planning, especially in high-risk environments. First, we consider uncertainty-aware Bayesian Neural Networks for adaptive control, and introduce a method for guaranteeing safety under certain assumptions. Second, we investigate deep quantile regression learning methods for learning time-and-state varying uncertainties, which we use to perform trajectory optimization with Model Predictive Control. Third, we introduce a complete framework for risk-aware traversability and planning, which we use to enable safe exploration of extreme environments. Fourth, we again leverage deep quantile regression and establish a method for accurately learning the distribution of traversability risks in these environments, which can be used to create safety constraints for planning and control.Ph.D

    Belief Representations for Planning with Contact Uncertainty

    Full text link
    While reaching for your morning coffee you may accidentally bump into the table, yet you reroute your motion with ease and grab your cup. An effective autonomous robot will need to have a similarly seamless recovery from unexpected contact. As simple as this may seem, even sensing this contact is a challenge for many robots, and when detected contact is often treated as an error that an operator is expected to resolve. Robots operating in our daily environments will need to reason about the information they have gained from contact and replan autonomously. This thesis examines planning under uncertainty with contact sensitive robot arms. Robots do not have skin and cannot precisely sense the location of contact. This leads to the proposed Collision Hypothesis Set model for representing a belief over the possible occupancy of the world sensed through contact. To capture the specifics of planning in an unknown world with this measurement model, this thesis develops a POMDP approach called the Blindfolded Traveler's Problem. A good prior over the possible obstacles the robot might encounter is key to effective planning. This thesis develops a neural network approach for sampling potential obstacles that are consistent with both what a robot sees from its camera and what it feels through contact.PHDRoboticsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169845/1/bsaund_1.pd
    • …
    corecore