
    PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

    We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL). The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology. The sampling-based planners then provide roadmaps that connect robot configurations which can be successfully navigated by the RL agent. The same RL agents are used to control the robot under the direction of the planner, enabling long-range navigation. We use Probabilistic Roadmaps (PRMs) as the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL, both in simulation and on-robot, on two navigation tasks with non-trivial robot dynamics: end-to-end differential-drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. Our results show improvement in task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes trajectories up to 215 m long under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 m without violating the task constraints, in an environment 63 million times larger than the one used in training. Comment: 9 pages, 7 figures
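    The distinctive construction step is that roadmap edges are validated by rolling out the learned point-to-point agent rather than a straight-line local planner, so the roadmap only contains transitions the agent can actually execute. Below is a minimal sketch of that loop, assuming a toy proportional policy and a stub rollout in place of the paper's trained feature-based/deep-net agents and simulated environments.

```python
# Hypothetical sketch of the PRM-RL roadmap-building loop described above.
# The policy, rollout, and success check are stand-in stubs, not the paper's code.
import numpy as np

def sample_free_configs(n, bounds, rng):
    """Uniformly sample n configurations inside the workspace bounds."""
    lo, hi = bounds
    return rng.uniform(lo, hi, size=(n, len(lo)))

def rl_agent_reaches(start, goal, policy, max_steps=200, tol=0.1):
    """Roll out the short-range RL policy; report whether it reaches the goal.
    In PRM-RL this rollout replaces the straight-line local planner."""
    state = np.array(start, dtype=float)
    for _ in range(max_steps):
        state = state + policy(state, goal)   # one control step
        if np.linalg.norm(state - goal) < tol:
            return True
    return False

def build_prm_rl_roadmap(n_nodes, bounds, policy, k=5, seed=0):
    """Connect each node to its k nearest neighbours only when the RL agent
    succeeds in both directions (learned policies need not be symmetric)."""
    rng = np.random.default_rng(seed)
    nodes = sample_free_configs(n_nodes, bounds, rng)
    edges = set()
    for i, q in enumerate(nodes):
        dists = np.linalg.norm(nodes - q, axis=1)
        for j in np.argsort(dists)[1:k + 1]:   # skip index 0 (the node itself)
            if rl_agent_reaches(q, nodes[j], policy) and \
               rl_agent_reaches(nodes[j], q, policy):
                edges.add((i, int(j)))
    return nodes, edges

# Toy proportional policy standing in for a trained point-to-point agent.
toy_policy = lambda s, g: 0.2 * (g - s)
nodes, edges = build_prm_rl_roadmap(
    30, (np.array([0.0, 0.0]), np.array([10.0, 10.0])), toy_policy)
print(f"{len(nodes)} nodes, {len(edges)} RL-verified edges")
```

    Checking edges in both directions reflects that a learned policy, unlike a geometric local planner, may succeed from A to B but fail from B to A.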

    High-Dimensional Motion Planning and Learning Under Uncertain Conditions

    Many existing path planning methods do not adequately account for uncertainty. Without uncertainty these techniques work well, but in real-world environments they struggle due to inaccurate sensor models, arbitrarily moving obstacles, and uncertain action consequences. For example, picking up and storing children's toys is a simple task for humans, yet for a household robot it can be daunting. The room must be modeled with sensors, which may or may not detect all the strewn toys. The robot must be able to detect and avoid the child, who may be moving the very toys the robot is tasked with cleaning. Finally, if the robot missteps and places a foot on a toy, it must be able to compensate for the unexpected consequences of its actions. This example demonstrates that even simple human tasks are fraught with uncertainties that must be accounted for in robotic path planning algorithms. This work presents the first steps towards migrating sampling-based path planning methods to real-world environments by addressing three different types of uncertainty: (1) model uncertainty, (2) spatio-temporal obstacle uncertainty (moving obstacles), and (3) action consequence uncertainty. Uncertainty is encoded directly into path planning through a data structure in order to successfully and efficiently identify safe robot paths in sensed environments with noise. This encoding produces paths with clearance comparable to planning methods known for high clearance, but at an order of magnitude less computational cost. This work also shows that formal control theory methods combined with path planning yield a technique with a 95% collision-free navigation rate among 300 moving obstacles. Finally, it demonstrates that reinforcement learning can be combined with planning data structures to autonomously learn motion controls of a seven-degree-of-freedom robot at low computational cost despite the number of dimensions.
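    One simple way to encode sensing uncertainty directly into a planning data structure, in the spirit described above, is to fold a clearance-versus-noise risk term into roadmap edge weights so the graph search itself trades path length against safety. The sketch below assumes an exponential risk penalty driven by an assumed sensor noise level sigma; it is an illustration of the general idea, not the dissertation's exact data structure.

```python
# Minimal sketch: risk-weighted roadmap edges searched with plain Dijkstra.
import heapq
import math

def edge_cost(length, clearance, sigma, risk_weight=5.0):
    """Blend geometric length with a risk term: the smaller the clearance
    relative to the sensor noise sigma, the higher the penalty."""
    risk = math.exp(-clearance / max(sigma, 1e-9))
    return length + risk_weight * risk

def shortest_safe_path(graph, start, goal):
    """Dijkstra over risk-weighted edges; graph maps node -> [(nbr, cost)]."""
    dist, prev = {start: 0.0}, {}
    frontier = [(0.0, start)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue                       # stale queue entry
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(frontier, (nd, v))
    path, node = [], goal
    while node in prev or node == start:   # walk back to the start
        path.append(node)
        if node == start:
            break
        node = prev[node]
    return path[::-1]

# Toy roadmap: edge B-C is short but hugs an obstacle under noisy sensing.
sigma = 0.3  # assumed sensor noise level for the sketch
graph = {
    "A": [("B", edge_cost(1.0, 1.0, sigma))],
    "B": [("C", edge_cost(1.0, 0.05, sigma)),   # short, risky
          ("D", edge_cost(1.5, 1.0, sigma))],   # longer, safe
    "C": [("G", edge_cost(1.0, 1.0, sigma))],
    "D": [("G", edge_cost(1.5, 1.0, sigma))],
}
print(shortest_safe_path(graph, "A", "G"))      # prefers the safer detour via D
```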

    Learning to reach and reaching to learn: a unified approach to path planning and reactive control through reinforcement learning

    The next generation of intelligent robots will need to be able to plan reaches: not just ballistic point-to-point reaches, but reaches around things such as the edge of a table, a nearby human, or any other known object in the robot's workspace. Planning reaches may seem easy to us humans, because we do it so intuitively, but it has proven to be a challenging problem, which continues to limit the versatility of what robots can do today. In this document, I propose a novel intrinsically motivated RL system that draws on both Path/Motion Planning and Reactive Control. Through Reinforcement Learning, it tightly integrates these two previously disparate approaches to robotics. The RL system is evaluated on a task which is as yet unsolved by roboticists in practice: putting the palm of the iCub humanoid robot on arbitrary target objects in its workspace, starting from arbitrary initial configurations. Such motions can be generated by planning, or searching the configuration space, but this typically results in some kind of trajectory, which must then be tracked by a separate controller, and such an approach offers a brittle runtime solution because it is inflexible. Purely reactive systems are robust to many problems that render a planned trajectory infeasible, but lacking the capacity to search, they tend to get stuck behind constraints, and therefore do not replace motion planners. The planner/controller proposed here is novel in that it deliberately plans reaches without the need to track trajectories. Instead, reaches are composed of sequences of reactive motion primitives, implemented by my Modular Behavioral Environment (MoBeE), which provides (fictitious) force control with reactive collision avoidance by way of a realtime kinematic/geometric model of the robot and its workspace. Thus, to the best of my knowledge, mine is the first reach planning approach to simultaneously offer the best of both the Path/Motion Planning and Reactive Control approaches. By controlling the real, physical robot directly, and feeling the influence of the constraints imposed by MoBeE, the proposed system learns a stochastic model of the iCub's configuration space. The model is then exploited as a multiple-query path planner to find sensible pre-reach poses from which to initiate reaching actions. Experiments show that the system can autonomously find practical reaches to target objects in the workspace, and that it offers excellent robustness to changes in the workspace configuration as well as to noise in the robot's sensory-motor apparatus.
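    To make the planning-over-a-learned-stochastic-model idea concrete: if each primitive execution on the robot is logged as a success or failure, a reach can be planned as the pose sequence with the highest estimated success probability, and maximizing a product of probabilities is just a shortest-path problem over -log p edge weights. The sketch below is a hedged illustration under those assumptions; MoBeE, the iCub, and the pose names are stand-ins, not the thesis implementation.

```python
# Stochastic roadmap sketch: estimate primitive success rates from counts,
# then plan the most reliable pose sequence with Dijkstra over -log p.
import heapq
import math
from collections import defaultdict

class StochasticRoadmap:
    def __init__(self):
        self.tries = defaultdict(int)   # (pose, next_pose) -> attempts
        self.wins = defaultdict(int)    # (pose, next_pose) -> successes

    def record(self, pose, next_pose, succeeded):
        """Update counts after executing one reactive primitive on the robot."""
        self.tries[(pose, next_pose)] += 1
        self.wins[(pose, next_pose)] += int(succeeded)

    def p_success(self, pose, next_pose):
        """Laplace-smoothed estimate of the primitive's success probability."""
        t = self.tries[(pose, next_pose)]
        return (self.wins[(pose, next_pose)] + 1.0) / (t + 2.0)

    def most_reliable_reach(self, start, goal, neighbours):
        """Maximise the product of success probabilities, i.e. minimise
        the sum of -log p, so ordinary Dijkstra applies."""
        dist, prev = {start: 0.0}, {}
        frontier = [(0.0, start)]
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:
                break
            for v in neighbours(u):
                nd = d - math.log(self.p_success(u, v))
                if nd < dist.get(v, math.inf):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(frontier, (nd, v))
        path, node = [], goal
        while node != start:
            if node not in prev:
                return None             # goal never reached in the model
            path.append(node)
            node = prev[node]
        return [start] + path[::-1]

# Toy usage with hypothetical pose names: the left pre-reach pose has
# succeeded more often, so it is chosen for the reach.
rm = StochasticRoadmap()
for ok in [True, True, False, True]:
    rm.record("home", "pre_reach_left", ok)
for ok in [False, False, True]:
    rm.record("home", "pre_reach_right", ok)
rm.record("pre_reach_left", "palm_on_target", True)
rm.record("pre_reach_right", "palm_on_target", True)
nbrs = {"home": ["pre_reach_left", "pre_reach_right"],
        "pre_reach_left": ["palm_on_target"],
        "pre_reach_right": ["palm_on_target"]}
print(rm.most_reliable_reach("home", "palm_on_target",
                             lambda u: nbrs.get(u, [])))
```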

    Learning Inverse Statics Models Efficiently With Symmetry-Based Exploration

    Learning (inverse) kinematics and dynamics models of dexterous robots for the entire action or observation space is challenging and costly. Sampling the entire space is usually intractable in terms of time, wear, and tear. We propose an efficient approach to learn inverse statics models, primarily for gravity compensation, by exploring only a small part of the configuration space and exploiting the symmetry properties of the inverse statics mapping. In particular, there exist symmetric configurations that require the same absolute motor torques to be maintained. We show that those symmetric configurations can be discovered, and that the functional relations between them can be successfully learned and exploited to generate multiple training samples from one sampled configuration-torque pair. This strategy drastically reduces the number of samples required for learning inverse statics models. Moreover, we demonstrate that exploiting symmetries for learning inverse statics models is a generally applicable strategy for online and offline learning algorithms, which we exemplify with two different learning approaches. First, we modify the Direction Sampling approach for learning inverse statics models online, in a plain exploratory fashion, from scratch and without using a closed-loop controller. Second, we show that inverse statics mappings can be efficiently learned offline utilizing lattice sampling. Results for a 2R planar robot and a 3R simplified human arm demonstrate that their inverse statics mappings can be learned successfully for the entire configuration space. Furthermore, we demonstrate that the number of samples required for learning inverse statics mappings for the 2R and 3R manipulators can be reduced by factors of approximately 8 and 16, respectively, depending on the number of discovered symmetries.
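    As an illustration of the sample-multiplication idea, consider the 2R planar arm case mentioned in the abstract. Its gravity torques depend only on cos(q1) and cos(q1 + q2), so reflecting a configuration across the horizontal axis leaves the torques unchanged, while reflecting across the vertical axis negates them; one measured pair therefore yields four valid training pairs. The link parameters below are assumptions for the sketch, not values from the paper.

```python
# Sketch of symmetry-based sample multiplication for a 2R planar arm.
# The mass/length values are illustrative assumptions; the symmetries
# follow from the torques depending only on cos(q1) and cos(q1 + q2).
import math
import numpy as np

def gravity_torque_2r(q1, q2, m1=1.0, m2=1.0, l1=0.5,
                      lc1=0.25, lc2=0.25, g=9.81):
    """Static gravity-compensation torques of a 2R planar arm (stand-in model)."""
    t2 = m2 * lc2 * g * math.cos(q1 + q2)
    t1 = (m1 * lc1 + m2 * l1) * g * math.cos(q1) + t2
    return np.array([t1, t2])

def symmetric_samples(q, tau):
    """Expand one measured (q, tau) pair into four via the two reflections
    (horizontal mirror keeps tau, vertical mirror negates it) and their
    composition."""
    q1, q2 = q
    return [
        ((q1, q2), tau),                      # original sample
        ((-q1, -q2), tau),                    # horizontal mirror: same torques
        ((math.pi - q1, -q2), -tau),          # vertical mirror: negated torques
        ((math.pi + q1, q2), -tau),           # both reflections composed
    ]

# Sanity check: every generated pair matches the analytic stand-in model.
q = (0.7, -0.4)
tau = gravity_torque_2r(*q)
for q_sym, tau_sym in symmetric_samples(q, tau):
    assert np.allclose(gravity_torque_2r(*q_sym), tau_sym)
print("one sampled pair expanded into", len(symmetric_samples(q, tau)), "samples")
```

    The factor-of-four expansion here is consistent with the abstract's reported reduction of roughly 8x for the 2R arm once additional discovered symmetries are exploited.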

    Interactions Between Humans and Robots
