    Driving with Style: Inverse Reinforcement Learning in General-Purpose Planning for Automated Driving

    Behavior and motion planning play an important role in automated driving. Traditionally, behavior planners instruct local motion planners with predefined behaviors. Due to the high scene complexity in urban environments, unpredictable situations may occur in which behavior planners fail to match predefined behavior templates. Recently, general-purpose planners have been introduced that combine behavior and local motion planning. These planners allow behavior-aware motion planning given a single reward function. However, two challenges arise: first, this function has to map a complex feature space into rewards; second, it has to be manually tuned by an expert, which is a tedious task. In this paper, we propose an approach that relies on human driving demonstrations to automatically tune reward functions. This study offers important insights into the driving style optimization of general-purpose planners with maximum entropy inverse reinforcement learning. We evaluate our approach based on the expected value difference between learned and demonstrated policies. Furthermore, we compare the similarity of human-driven trajectories with optimal policies of our planner under learned and expert-tuned reward functions. Our experiments show that we are able to learn reward functions exceeding the level of manual expert tuning without prior domain knowledge.
    Comment: Appeared at IROS 2019. Accepted version. Added/updated footnote, minor correction in preliminaries.
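    As a rough illustration of the learning loop this abstract describes, here is a minimal sketch of maximum entropy IRL with a linear reward, reward(s) = w . f(s). The expected_f_under planner interface, the learning rate, and the iteration count are assumptions for illustration, not the paper's actual API.

        import numpy as np

        def learn_reward(w0, demo_f, expected_f_under, lr=0.05, iters=200):
            """Gradient ascent on the MaxEnt IRL log-likelihood.

            w0               : initial reward weights, reward(s) = w . f(s)
            demo_f           : mean feature vector of the human demonstrations
            expected_f_under : callable w -> expected feature vector of the
                               planner's policy under reward weights w
                               (an assumed interface, not the paper's API)
            """
            w = np.asarray(w0, dtype=float)
            for _ in range(iters):
                # MaxEnt gradient: demonstrated minus expected feature counts.
                w += lr * (demo_f - expected_f_under(w))
            return w

    At convergence the planner's expected feature counts match the demonstrations, which is the sense in which a learned reward reproduces the demonstrated driving style.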

    Softstar: Heuristic-guided probabilistic inference

    Recent machine learning methods for sequential behavior prediction estimate the motives of behavior rather than the behavior itself. This higher-level abstraction improves generalization in different prediction settings, but computing predictions often becomes intractable in large decision spaces. We propose the Softstar algorithm, a softened heuristic-guided search technique for the maximum entropy inverse optimal control model of sequential behavior. This approach supports probabilistic search with bounded approximation error at a significantly reduced computational cost when compared to sampling-based methods. We present the algorithm, analyze approximation guarantees, and compare performance with simulation-based inference on two distinct complex decision tasks.
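    The following is a minimal sketch of the softened-search idea under stated assumptions: replace A*'s hard minimum with a softmin (via log-sum-exp), so the search approximates the MaxEnt distribution over paths rather than a single shortest path. The neighbors, cost, and heuristic callables and the simple stopping rule are illustrative, not the paper's exact algorithm or error bounds.

        import heapq
        import numpy as np

        def softstar(start, goal, neighbors, cost, heuristic):
            """Heuristic-guided softened search (sketch of the Softstar idea)."""
            softmin = lambda a, b: -np.logaddexp(-a, -b)
            value = {start: 0.0}                 # soft cost-to-come per node
            frontier = [(heuristic(start), start)]
            while frontier:
                f, node = heapq.heappop(frontier)
                # Frontier paths costlier than the goal's soft value add only
                # negligible probability mass; stopping here bounds the error.
                if f > value.get(goal, np.inf):
                    break
                for nxt in neighbors(node):
                    new = softmin(value[node] + cost(node, nxt),
                                  value.get(nxt, np.inf))
                    if new < value.get(nxt, np.inf) - 1e-6:
                        value[nxt] = new
                        heapq.heappush(frontier, (new + heuristic(nxt), nxt))
            return value.get(goal, np.inf)       # approx. -log partition at goal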

    Synthesizing Robotic Handwriting Motion by Learning from Human Demonstrations

    This paper contributes a novel framework that enables a robotic agent to efficiently learn and synthesize believable handwriting motion. We situate the framework as a foundation with the goal of allowing children to observe, correct, and engage with the robot, and thereby learn the handwriting skill themselves. The framework adapts the principle behind ensemble methods, where improved performance is obtained by combining the output of multiple simple algorithms, to an inverse optimal control problem. This integration addresses the challenges of rapidly extracting and representing multiple-mode motion trajectories, with cost forms that are transferable and interpretable for the development of the robot's compliance control. It also incorporates a feature inspired by human movement, which provides intuitive motion modulation and generalizes the synthesis to poorly written robotic samples for children to identify and correct. We present results on the success of synthesizing a variety of natural-looking motion samples based on the learned cost functions. The framework is validated by a user study in which the synthesized dynamical motion is shown to be hard to distinguish from real human handwriting.
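    To make the ensemble idea concrete, here is a small sketch assuming a cost built as a weighted sum of simple, interpretable components over a pen trajectory; the component names (jerk, speed variance) and the candidate-selection synthesis step are illustrative, not the paper's actual feature set or motion generator.

        import numpy as np

        # Illustrative simple cost components over an (N, 2) pen trajectory.
        jerk      = lambda tr: np.sum(np.diff(tr, n=3, axis=0) ** 2)
        speed_var = lambda tr: np.var(np.linalg.norm(np.diff(tr, axis=0), axis=1))

        def ensemble_cost(tr, components, weights):
            """Weighted sum of simple costs, in the spirit of ensemble methods:
            each component scores one interpretable aspect of the motion."""
            return sum(w * c(tr) for w, c in zip(weights, components))

        def synthesize(candidates, components, weights):
            """Pick the candidate motion the learned ensemble cost prefers."""
            return min(candidates,
                       key=lambda tr: ensemble_cost(tr, components, weights))

    Because each component is simple, the learned weights stay interpretable, which is what makes the cost transferable to compliance control.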

    Modeling Driver Behavior From Demonstrations in Dynamic Environments Using Spatiotemporal Lattices

    One of the most challenging tasks in the development of path planners for intelligent vehicles is the design of the cost function that models the desired behavior of the vehicle. While this task has traditionally been accomplished by hand-tuning the model parameters, recent approaches propose to learn the model automatically from demonstrated driving data using Inverse Reinforcement Learning (IRL). To determine whether the model has correctly captured the demonstrated behavior, most IRL methods require obtaining a policy by repeatedly solving the forward control problem. Calculating the full policy is costly in continuous or large domains and is thus often approximated by finding a single trajectory using traditional path-planning techniques. In this work, we propose to find such a trajectory using a conformal spatiotemporal state lattice, which offers two main advantages. First, by conforming the lattice to the environment, the search is focused only on feasible motions for the robot, saving computational power. Second, by considering time as part of the state, the trajectory is optimized with respect to the motion of the dynamic obstacles in the scene. As a consequence, the resulting trajectory can be used for model assessment. We show how the proposed IRL framework can successfully handle highly dynamic environments by modeling the highway tactical driving task from demonstrated driving data gathered with an instrumented vehicle.
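    A minimal sketch of search over a spatiotemporal lattice, assuming states of the form (station, lateral offset, time index); the state layout and the callbacks are illustrative assumptions, not the paper's exact discretization. Because time is part of the state, each collision check can be made against the predicted position of the dynamic obstacles at that state's time index.

        import heapq

        def plan(start, at_horizon, successors, edge_cost, collides):
            """Dijkstra search over (station, lateral, time) lattice states.

            `successors` enumerates lattice edges conforming to the road
            geometry; `collides(state)` checks the state against predicted
            dynamic-obstacle positions at that state's time index.
            """
            best = {start: 0.0}
            parent = {start: None}
            frontier = [(0.0, start)]
            while frontier:
                g, state = heapq.heappop(frontier)
                if g > best[state]:
                    continue                  # stale heap entry
                if at_horizon(state):         # end of the planning horizon
                    path = []
                    while state is not None:
                        path.append(state)
                        state = parent[state]
                    return path[::-1]
                for nxt in successors(state):
                    if collides(nxt):
                        continue
                    g2 = g + edge_cost(state, nxt)
                    if g2 < best.get(nxt, float("inf")):
                        best[nxt] = g2
                        parent[nxt] = state
                        heapq.heappush(frontier, (g2, nxt))
            return None                       # no feasible trajectory found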