Search CORE

269 research outputs found

Motion Planning for Autonomous Vehicles in Partially Observable Environments

Author: Taş Ömer Şahin
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 08/12/2022
Field of study

Unsicherheiten, welche aus Sensorrauschen oder nicht beobachtbaren Manöverintentionen anderer Verkehrsteilnehmer resultieren, akkumulieren sich in der Datenverarbeitungskette eines autonomen Fahrzeugs und führen zu einer unvollständigen oder fehlinterpretierten Umfeldrepräsentation. Dadurch weisen Bewegungsplaner in vielen Fällen ein konservatives Verhalten auf. Diese Dissertation entwickelt zwei Bewegungsplaner, welche die Defizite der vorgelagerten Verarbeitungsmodule durch Ausnutzung der Reaktionsfähigkeit des Fahrzeugs kompensieren. Diese Arbeit präsentiert zuerst eine ausgiebige Analyse über die Ursachen und Klassifikation der Unsicherheiten und zeigt die Eigenschaften eines idealen Bewegungsplaners auf. Anschließend befasst sie sich mit der mathematischen Modellierung der Fahrziele sowie den Randbedingungen, welche die Sicherheit gewährleisten. Das resultierende Planungsproblem wird mit zwei unterschiedlichen Methoden in Echtzeit gelöst: Zuerst mit nichtlinearer Optimierung und danach, indem es als teilweise beobachtbarer Markov-Entscheidungsprozess (POMDP) formuliert und die Lösung mit Stichproben angenähert wird. Der auf nichtlinearer Optimierung basierende Planer betrachtet mehrere Manöveroptionen mit individuellen Auftrittswahrscheinlichkeiten und berechnet daraus ein Bewegungsprofil. Er garantiert Sicherheit, indem er die Realisierbarkeit einer zufallsbeschränkten Rückfalloption gewährleistet. Der Beitrag zum POMDP-Framework konzentriert sich auf die Verbesserung der Stichprobeneffizienz in der Monte-Carlo-Planung. Erstens werden Informationsbelohnungen definiert, welche die Stichproben zu Aktionen führen, die eine höhere Belohnung ergeben. Dabei wird die Auswahl der Stichproben für das reward-shaped Problem durch die Verwendung einer allgemeinen Heuristik verbessert. Zweitens wird die Kontinuität in der Reward-Struktur für die Aktionsauswahl ausgenutzt und dadurch signifikante Leistungsverbesserungen erzielt. Evaluierungen zeigen, dass mit diesen Planern große Erfolge in Fahrversuchen und Simulationsstudien mit komplexen Interaktionsmodellen erreicht werden

KITopen

Role Playing Learning for Socially Concomitant Mobile Robot Navigation

Author: Ge Shuzhi Sam
Jiang Rui
Lee Tong Heng
Li Mingming
Publication venue
Publication date: 29/05/2017
Field of study

In this paper, we present the Role Playing Learning (RPL) scheme for a mobile robot to navigate socially with its human companion in populated environments. Neural networks (NN) are constructed to parameterize a stochastic policy that directly maps sensory data collected by the robot to its velocity outputs, while respecting a set of social norms. An efficient simulative learning environment is built with maps and pedestrians trajectories collected from a number of real-world crowd data sets. In each learning iteration, a robot equipped with the NN policy is created virtually in the learning environment to play itself as a companied pedestrian and navigate towards a goal in a socially concomitant manner. Thus, we call this process Role Playing Learning, which is formulated under a reinforcement learning (RL) framework. The NN policy is optimized end-to-end using Trust Region Policy Optimization (TRPO), with consideration of the imperfectness of robot's sensor measurements. Simulative and experimental results are provided to demonstrate the efficacy and superiority of our method

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

ScholarBank@NUS

Health Aware Stochastic Planning For Persistent Package Delivery Missions Using Quadrotors

Author: Agha-mohammadi Ali-akbar
How Jonathan P.
Ure Nazim Kemal
Vian John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

In persistent missions, taking system’s health and capability degradation into account is an essential factor to predict and avoid failures. The state space in health-aware planning problems is often a mixture of continuous vehicle-level and discrete mission-level states. This in particular poses a challenge when the mission domain is partially observable and restricts the use of computationally expensive forward search methods. This paper presents a method that exploits a structure that exists in many health-aware planning problems and performs a two-layer planning scheme. The lower layer exploits the local linearization and Gaussian distribution assumption over vehicle-level states while the higher layer maintains a non-Gaussian distribution over discrete mission-level variables. This two-layer planning scheme allows us to limit the expensive online forward search to the mission-level states, and thus predict system’s behavior over longer horizons in the future. We demonstrate the performance of the method on a long duration package delivery mission using a quadrotor in a partially-observable domain in the presence of constraints and health/capability degradation

CiteSeerX

DSpace@MIT

Crossref

Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning

Author: Fan Tingxiang
Liao Xinyi
Liu Wenxi
Long Pinxin
Pan Jia
Zhang Hao
Publication venue
Publication date: 20/05/2018
Field of study

Developing a safe and efficient collision avoidance policy for multiple robots is challenging in the decentralized scenarios where each robot generate its paths without observing other robots' states and intents. While other distributed multi-robot collision avoidance systems exist, they often require extracting agent-level features to plan a local collision-free action, which can be computationally prohibitive and not robust. More importantly, in practice the performance of these methods are much lower than their centralized counterparts. We present a decentralized sensor-level collision avoidance policy for multi-robot systems, which directly maps raw sensor measurements to an agent's steering commands in terms of movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to find an optimal policy which is trained over a large number of robots on rich, complex environments simultaneously using a policy gradient based reinforcement learning algorithm. We validate the learned sensor-level collision avoidance policy in a variety of simulated scenarios with thorough performance evaluations and show that the final learned policy is able to find time efficient, collision-free paths for a large-scale robot system. We also demonstrate that the learned policy can be well generalized to new scenarios that do not appear in the entire training period, including navigating a heterogeneous group of robots and a large-scale scenario with 100 robots. Videos are available at https://sites.google.com/view/drlmac

arXiv.org e-Print Archive

Crossref

Recommended from our members

Abstractions in Reasoning for Long-Term Autonomy

Author: Wray Kyle Hollins
Publication venue: ScholarWorks@UMass Amherst
Publication date: 02/07/2019
Field of study

The path to building adaptive, robust, intelligent agents has led researchers to develop a suite of powerful models and algorithms for agents with a single objective. However, in recent years, attempts to use this monolithic approach to solve an ever-expanding set of complex real-world problems, which increasingly include long-term autonomous deployments, have illuminated challenges in its ability to scale. Consequently, a fragmented collection of hierarchical and multi-objective models were developed. This trend continues into the algorithms as well, as each approximates an optimal solution in a different manner for scalability. These models and algorithms represent an attempt to solve pieces of an overarching problem: how can an agent explicitly model and integrate the necessary aspects of reasoning required to achieve long-term autonomy? This thesis presents a general hierarchical and multi-objective model called a policy network that unifies prior fragmented solutions into a single graphical decision-making structure. Policy networks are broadly useful to solve numerous real-world problems. This thesis focuses on autonomous vehicle (AV) problems: (1) route-planning with multiple objectives; (2) semi-autonomy with proactive transfer of control; and (3) intersection decision-making for reasoning online about any number of other vehicles and pedestrians. Formal models are presented for each of the distinct problems. Solutions are evaluated using real-world map data in simulation and demonstrated on a fully operational AV prototype driving on real public roads. Policy networks serve as a shared underlying framework for all three, enabling their seamless integration as parts of an overall solution for rich, real-world, scalable decision-making in agents with long-term autonomy

ScholarWorks@UMass Amherst