
    Belief State Planning for Autonomous Driving: Planning with Interaction, Uncertain Prediction and Uncertain Perception

    This thesis presents a behavior planning algorithm for automated driving in urban environments with an uncertain and dynamic nature. The uncertainty in the environment arises from the fact that the intentions as well as the future trajectories of the surrounding drivers cannot be measured directly but can only be estimated in a probabilistic fashion. Even the perception of objects is uncertain due to sensor noise or possible occlusions. When driving in such environments, the autonomous car must predict the behavior of the other drivers and plan safe, comfortable, and legal trajectories. Planning such trajectories requires robust decision making when several high-level options are available to the autonomous car. Current planning algorithms for automated driving split the problem into different subproblems, ranging from discrete, high-level decision making to prediction and continuous trajectory planning. This separation of one problem into several subproblems, combined with rule-based decision making, leads to suboptimal behavior. This thesis presents a global, closed-loop formulation of the motion planning problem that intertwines action selection and the corresponding prediction of the other agents in one optimization problem. The global formulation allows the planning algorithm to make decisions on high-level options implicitly. Furthermore, the closed-loop nature of the algorithm optimizes the solution for various future scenarios concerning the behavior of the other agents. Formulating prediction and planning as an intertwined problem makes it possible to model interaction, i.e. the future reaction of the other drivers to the behavior of the autonomous car. The problem is modeled as a partially observable Markov decision process (POMDP) with a discrete action space and continuous state and observation spaces. The solution to the POMDP is a policy over belief states, which contains different reactive plans for possible future scenarios. Surrounding drivers are modeled with interactive, probabilistic agent models to account for their prediction uncertainty. The field of view of the autonomous car is simulated forward over the whole planning horizon during the optimization of the policy. Simulating the possible corresponding future observations allows the algorithm to select actions that actively reduce the uncertainty of the world state. Depending on the scenario, the behavior of the autonomous car is optimized in the longitudinal direction or in the combined lateral and longitudinal directions. The algorithm is formulated in a generic way and solved online, which allows it to be applied to various road layouts and scenarios. While such a generic problem formulation is intractable to solve exactly, this thesis demonstrates how a sufficiently good approximation to the optimal policy can be found online. The problem is solved by combining state-of-the-art Monte Carlo tree search algorithms with near-optimal, domain-specific roll-outs. The algorithm is evaluated in scenarios such as crossing intersections under unknown intentions of other crossing vehicles, interactive lane changes into narrow gaps, and decision making at intersections with large occluded areas. It is shown that the behavior of the closed-loop planner is less conservative than that of comparable open-loop planners. More precisely, it is even demonstrated that the policy enables the autonomous car to drive in a similar way as an omniscient planner with full knowledge of the scene. It is also demonstrated how the autonomous car executes actions to actively gather more information about its surroundings and to reduce the uncertainty of its belief state.
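    The abstract describes combining Monte Carlo tree search over belief states with domain-specific rollouts. The sketch below is a minimal, self-contained illustration of that idea, not the thesis implementation: a POMCP-style root search over a particle belief about another driver's hidden intention, with discrete longitudinal accelerations as actions. It omits the observation branching of the full policy tree, and all dynamics, parameters, and names (step, rollout, plan) are illustrative assumptions.

```python
# Illustrative sketch (not the thesis code): root-level Monte Carlo search
# over a particle belief for a single ego vehicle approaching a conflict
# point. The other driver's intention ("yield" vs. "go") is hidden and only
# represented by the particles; rollouts use a simple heuristic driver model.
import math
import random

ACTIONS = [-2.0, 0.0, 2.0]          # discrete longitudinal accelerations (m/s^2)
DT, HORIZON, UCB_C = 0.5, 8, 2.0    # assumed planning parameters


def step(state, action, intention):
    """Propagate ego and other vehicle one time step (toy dynamics)."""
    ego_pos, ego_vel, other_pos = state
    ego_vel = max(0.0, ego_vel + action * DT)
    ego_pos += ego_vel * DT
    other_speed = 0.0 if intention == "yield" else 8.0
    other_pos += other_speed * DT + random.gauss(0.0, 0.2)   # prediction noise
    return (ego_pos, ego_vel, other_pos)


def reward(state, action):
    ego_pos, ego_vel, other_pos = state
    r = 0.1 * ego_vel - 0.01 * abs(action)          # progress and comfort
    if abs(ego_pos - 30.0) < 2.0 and abs(other_pos - 30.0) < 2.0:
        r -= 100.0                                   # collision at conflict point
    return r


def rollout(state, intention, depth):
    """Domain-specific heuristic rollout: brake if the other car is near the conflict point."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        action = -2.0 if abs(state[2] - 30.0) < 10.0 and state[0] < 30.0 else 2.0
        total += discount * reward(state, action)
        state = step(state, action, intention)
        discount *= 0.95
    return total


def plan(belief_particles, n_sims=300):
    """Select one action from a particle belief via UCB over the root actions.

    belief_particles: list of (state, intention) hypotheses, i.e. the belief
    over the continuous world state and the other driver's hidden intention.
    """
    stats = {a: [0, 0.0] for a in ACTIONS}           # visits, total return
    for i in range(n_sims):
        state, intention = random.choice(belief_particles)
        best, best_val = None, -math.inf
        for a, (n, q) in stats.items():
            val = math.inf if n == 0 else q / n + UCB_C * math.sqrt(math.log(i + 1) / n)
            if val > best_val:
                best, best_val = a, val
        nxt = step(state, best, intention)
        ret = reward(state, best) + 0.95 * rollout(nxt, intention, HORIZON)
        stats[best][0] += 1
        stats[best][1] += ret
    return max(ACTIONS, key=lambda a: stats[a][1] / max(1, stats[a][0]))


# Example: uncertain belief, half the particles assume the other car yields.
belief = [((0.0, 5.0, 10.0), "yield") for _ in range(50)] + \
         [((0.0, 5.0, 10.0), "go") for _ in range(50)]
print("chosen acceleration:", plan(belief))
```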

    Belief State Planning for Autonomous Driving: Planning with Interaction, Uncertain Prediction and Uncertain Perception

    This work presents a behavior planning algorithm for automated driving in urban environments with an uncertain and dynamic nature. The algorithm explicitly accounts for prediction uncertainty (e.g. different intentions), perception uncertainty (e.g. occlusions), and the uncertain, interactive behavior of the other agents. Simulating the most likely future scenarios makes it possible to find an optimal policy online that enables non-conservative planning under uncertainty.

    Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs

    Interactive partially observable Markov decision processes (I-POMDPs) provide a formal framework for planning for a self-interested agent in multiagent settings. An agent operating in a multiagent environment must deliberate about the actions that other agents may take and the effect these actions have on the environment and the rewards it receives. Traditional I-POMDPs model this dependence on the actions of other agents using joint action and model spaces, so the solution complexity grows exponentially with the number of agents, which complicates scalability. In this paper, we model and extend anonymity and context-specific independence -- problem structures often present in agent populations -- for computational gain. We empirically demonstrate the efficiency gained from exploiting these problem structures by solving a new multiagent problem involving more than 1,000 agents. Comment: 8-page article plus a two-page appendix containing proofs, in Proceedings of the 25th International Conference on Automated Planning and Scheduling, 201
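    As a rough illustration of the anonymity structure the abstract refers to (the frame-action hypergraph part is not shown), the sketch below counts action-count configurations instead of joint actions; the function names are mine, not the paper's.

```python
# Illustrative sketch of action anonymity: if an agent's dynamics depend only
# on how many other agents take each action (not on who takes it), planning can
# work over action-count "configurations", which grow polynomially in the
# number of agents instead of exponentially.
from math import comb
from itertools import combinations_with_replacement
from collections import Counter

def num_joint_actions(n_agents, n_actions):
    """Size of the joint action space without exploiting structure."""
    return n_actions ** n_agents

def num_configurations(n_agents, n_actions):
    """Number of ways to split n_agents over n_actions (stars and bars)."""
    return comb(n_agents + n_actions - 1, n_actions - 1)

def configurations(n_agents, actions):
    """Enumerate the action-count configurations as Counters."""
    for combo in combinations_with_replacement(actions, n_agents):
        yield Counter(combo)

if __name__ == "__main__":
    n_agents, actions = 1000, ["go", "stop", "wait"]
    print(num_joint_actions(n_agents, len(actions)))   # 3**1000, astronomically large
    print(num_configurations(n_agents, len(actions)))  # 501501 configurations
    # Small example: 4 agents over 2 actions yields 5 configurations.
    print(sum(1 for _ in configurations(4, ["go", "stop"])))
```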

    Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

    Acting in domains where an agent must plan several steps ahead to achieve a goal can be a challenging task, especially if the agent's sensors provide only noisy or partial information. In this setting, Partially Observable Markov Decision Processes (POMDPs) provide a planning framework that optimally trades off actions that contribute to the agent's knowledge against actions that increase the agent's immediate reward. However, the task of specifying the POMDP's parameters is often onerous. In particular, setting the immediate rewards to achieve a desired balance between information gathering and acting is often not intuitive. In this work, we propose an approximation based on minimizing the immediate Bayes risk for choosing actions when the transition, observation, and reward models are uncertain. The Bayes-risk criterion avoids the computational intractability of solving a POMDP with a multi-dimensional continuous state space; we show that it performs well in a variety of problems. We use policy queries, in which we ask an expert for the correct action, to infer the consequences of a potential pitfall without experiencing its effects. More importantly for human–robot interaction settings, policy queries allow the agent to learn the reward model without the reward values ever being specified.
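    A minimal sketch of the Bayes-risk selection rule described above, assuming a toy sampled posterior over the reward model; the names (sample_reward_models, bayes_risk, QUERY_COST) are illustrative and not the authors' code.

```python
# Illustrative sketch: the agent keeps samples of the uncertain reward model,
# scores each action by its expected loss relative to the per-sample best
# action (the Bayes risk), and falls back to a policy query (asking an expert
# for the correct action) when that expected loss is too large to risk acting.
import random

ACTIONS = ["left", "right", "listen"]
QUERY_COST = 0.5          # assumed cost of asking the expert

def sample_reward_models(n):
    """Each sample maps action -> expected immediate reward (toy model posterior)."""
    return [{"left": random.gauss(1.0, 2.0),
             "right": random.gauss(-1.0, 2.0),
             "listen": random.gauss(-0.1, 0.1)} for _ in range(n)]

def bayes_risk(action, models):
    """Expected loss of 'action' versus the best action under each sampled model."""
    return sum(max(m.values()) - m[action] for m in models) / len(models)

def select_action(models):
    risks = {a: bayes_risk(a, models) for a in ACTIONS}
    best = min(risks, key=risks.get)
    if risks[best] > QUERY_COST:
        return "policy_query", risks      # too risky: ask the expert instead
    return best, risks

if __name__ == "__main__":
    action, risks = select_action(sample_reward_models(200))
    print(action, {a: round(r, 2) for a, r in risks.items()})
```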

    Environment Search Planning Subject to High Robot Localization Uncertainty

    As robots find applications in more complex roles, ranging from search and rescue to healthcare and services, they must be robust to greater levels of uncertainty about their localization and their environments. Without consideration of such uncertainties, robots cannot compensate accordingly, potentially leading to mission failure or injury to bystanders. This work addresses the task of searching a 2D area while reducing localization uncertainty. In this setting, the environment provides low-uncertainty pose updates from short-range beacons that cover only part of the environment. Elsewhere, the robot localizes using dead reckoning, relying on wheel-encoder and yaw-rate information from a gyroscope, so outside the regions with position updates the localization error grows unbounded over time. The work contributes a Belief Markov Decision Process formulation for solving the search problem and evaluates its performance using Partially Observable Monte Carlo Planning (POMCP). Additionally, the work contributes an approximate Markov Decision Process (MDP) formulation with a reduced-complexity state representation; the approximate problem is evaluated using value iteration. To provide a baseline, the Google OR-Tools package is used to solve the travelling salesman problem (TSP). Results are verified by simulating a differential drive robot in the Gazebo simulation environment. The POMCP results indicate that planning can be tuned to prioritize constraining uncertainty at the cost of increased path length. The MDP formulation provides consistently lower uncertainty with minimal increases in path length over the TSP solution. Both formulations show improved coverage outcomes.
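    As a rough illustration of the belief-MDP idea (not the paper's POMCP or value-iteration solvers), the sketch below augments the search state with a scalar localization variance that grows under dead reckoning, resets at beacons, and enters the reward alongside coverage; a myopic greedy planner stands in for the actual solvers, and all names and parameters are assumptions.

```python
# Illustrative sketch: belief = (cell, localization variance, covered cells).
# Dead reckoning grows the variance each move; a beacon cell collapses it.
# The reward trades newly covered cells against the remaining uncertainty.
GRID = 5                              # 5x5 search area
BEACONS = {(0, 0), (4, 4)}            # cells with low-uncertainty pose updates
GROWTH, LAMBDA = 0.2, 1.0             # variance growth per move, uncertainty weight
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def transition(belief, move):
    """Apply one dead-reckoned move to the belief state."""
    (x, y), var, covered = belief
    dx, dy = MOVES[move]
    cell = (min(GRID - 1, max(0, x + dx)), min(GRID - 1, max(0, y + dy)))
    var = 0.0 if cell in BEACONS else var + GROWTH      # beacon resets uncertainty
    return cell, var, covered | {cell}

def reward(prev_belief, belief):
    newly_covered = len(belief[2]) - len(prev_belief[2])
    return newly_covered - LAMBDA * belief[1]           # coverage vs. uncertainty

def greedy_plan(belief, steps=12):
    """Myopic baseline: pick the move with the best one-step reward."""
    path = []
    for _ in range(steps):
        move = max(MOVES, key=lambda m: reward(belief, transition(belief, m)))
        belief = transition(belief, move)
        path.append(move)
    return path, belief

if __name__ == "__main__":
    start = ((0, 0), 0.0, {(0, 0)})
    path, final = greedy_plan(start)
    print(path, "covered:", len(final[2]), "final variance:", round(final[1], 2))
```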