5 research outputs found

Probabilistic Preference Planning Problem for Markov Decision Processes

Authors: Hahn Ernst Moritz; Li Meilun; She Zhikun; Turrini Andrea; Zhang Lijun
Publication date: 01/05/2022

University of Twente Research Information

Abstractions in Reasoning for Long-Term Autonomy

Author: Wray Kyle Hollins
Publication venue: ScholarWorks@UMass Amherst
Publication date: 02/07/2019

The path to building adaptive, robust, intelligent agents has led researchers to develop a suite of powerful models and algorithms for agents with a single objective. However, in recent years, attempts to use this monolithic approach to solve an ever-expanding set of complex real-world problems, which increasingly include long-term autonomous deployments, have illuminated challenges in its ability to scale. Consequently, a fragmented collection of hierarchical and multi-objective models was developed. This trend continues into the algorithms as well, as each approximates an optimal solution in a different manner for scalability. These models and algorithms represent an attempt to solve pieces of an overarching problem: how can an agent explicitly model and integrate the necessary aspects of reasoning required to achieve long-term autonomy? This thesis presents a general hierarchical and multi-objective model called a policy network that unifies prior fragmented solutions into a single graphical decision-making structure. Policy networks are broadly useful for solving numerous real-world problems. This thesis focuses on autonomous vehicle (AV) problems: (1) route-planning with multiple objectives; (2) semi-autonomy with proactive transfer of control; and (3) intersection decision-making for reasoning online about any number of other vehicles and pedestrians. Formal models are presented for each of the distinct problems. Solutions are evaluated using real-world map data in simulation and demonstrated on a fully operational AV prototype driving on real public roads. Policy networks serve as a shared underlying framework for all three, enabling their seamless integration as parts of an overall solution for rich, real-world, scalable decision-making in agents with long-term autonomy.

ScholarWorks@UMass Amherst

Path planning for mobile robots in the real world: handling multiple objectives, hierarchical structures and partial information

Authors: Gambardella Luca Maria; Giusti Alessandro; Guzzi Jérôme
Publication date: 01/10/2018

Autonomous robots in real-world environments face a number of challenges even when accomplishing apparently simple tasks like moving to a given location. We present four realistic scenarios in which robot navigation takes into account partial information, hierarchical structures, and multiple objectives. We start by discussing navigation in indoor environments shared with people, where routes are characterized by effort, risk, and social impact. Next, we improve navigation by computing optimal trajectories and implementing human-friendly local navigation behaviors. Finally, we move to outdoor environments, where robots rely on uncertain traversability estimations and need to account for the risk of getting stuck or having to change route.

RERO DOC Digital Library

Programmation dynamique avec approximation de la fonction valeur (Dynamic programming with value function approximation)

Author: Munos Rémi
Publication venue: 'Revista Cientifica Hermes'
Publication date: 01/01/2008

Tools for approximating the value function are essential for handling large-scale sequential decision-making problems. The dynamic programming (DP) and reinforcement learning (RL) methods introduced in Chapters 1 and 2 assume that the value function can be represented (stored) by assigning a value to each state (the number of states being assumed finite), for example in the form of a table. These so-called exact solution methods make it possible to determine the optimal solution of the problem considered (or at least to converge to it). However, they often apply only to toy problems, because for most interesting applications the number of possible states is so large (or even infinite in the case of continuous spaces) that an exact representation of the function cannot be stored. It then becomes necessary to represent the value function approximately, using a moderate number of coefficients, and to redefine and analyze so-called approximate solution methods for DP and RL, in order to take into account the consequences of using such approximations in sequential decision-making problems.
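
The contrast drawn in this abstract, between a table with one value per state and a representation with a moderate number of coefficients, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the 100-state chain problem, the discount factor, and the choice of state aggregation (one coefficient per block of 10 neighbouring states) as the approximation scheme.

```python
# Sketch: exact (tabular) vs. approximate value iteration on a toy chain.
# The problem, discount factor and aggregation scheme are illustrative
# assumptions, not the method described in the chapter.

N_STATES = 100    # states 0..99 on a line
GAMMA = 0.9       # discount factor
GROUP_SIZE = 10   # number of neighbouring states sharing one coefficient


def step(state, action):
    """Deterministic move; action is -1 (left) or +1 (right)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def backup(state, value_of):
    """Bellman optimality backup at one state, given a value lookup."""
    outcomes = (step(state, action) for action in (-1, 1))
    return max(reward + GAMMA * value_of(nxt) for nxt, reward in outcomes)


def exact_value_iteration(sweeps=500):
    """Tabular method: one stored value per state."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        v = [backup(s, v.__getitem__) for s in range(N_STATES)]
    return v


def approximate_value_iteration(sweeps=500):
    """Approximate method: one coefficient per block of GROUP_SIZE states."""
    n_groups = N_STATES // GROUP_SIZE
    w = [0.0] * n_groups
    for _ in range(sweeps):
        current = w  # coefficients used for this sweep's backups
        targets = [backup(s, lambda x: current[x // GROUP_SIZE])
                   for s in range(N_STATES)]
        # Projection step: each coefficient becomes the mean backup of its block.
        w = [sum(targets[g * GROUP_SIZE:(g + 1) * GROUP_SIZE]) / GROUP_SIZE
             for g in range(n_groups)]
    return w


if __name__ == "__main__":
    exact = exact_value_iteration()
    coeffs = approximate_value_iteration()
    print("stored numbers:", len(exact), "vs", len(coeffs))
    print("value near the goal:", exact[-1], "vs", coeffs[-1])
```

The approximate version stores 10 numbers instead of 100, at the price of a visible error near the goal. That error is the kind of consequence of approximation that the abstract says has to be analyzed.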

HAL - Lille 3

INRIA a CCSD electronic archive server

Vector-Value Markov Decision Process for multi-objective stochastic path planning

Author: Mouaddib Abdel-Illah
Publication venue: HAL CCSD
Publication date: 01/01/2012

Path planning in stochastic environments where the shortest path is not always the best one is a challenging problem that arises in many real-world applications such as autonomous vehicles, robotics, and logistics. In this paper, we consider the problem of path planning in stochastic environments where the length of the path is not the only criterion to consider. We formalize this problem as multi-objective decision-theoretic path planning and transform it into a 2VMDP (Vector-Valued Markov Decision Process). We then show how to compute a policy that balances the different criteria considered. We describe different techniques that allow us to derive an optimal policy when it is hard to express the expected utilities, rewards and values with a single numerical measure. First, we examine existing approaches based on preferences and define notions of optimality with preferred solutions; second, we present approaches based on egalitarian social welfare techniques. Finally, experimental results show the feasibility of the approach and its benefit over single-objective techniques.
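
The abstract does not detail its techniques, so the following Python sketch is only one possible reading of an egalitarian criterion over vector-valued policies. It evaluates every deterministic policy of a tiny, hypothetical route model per objective and keeps the policy whose worst objective value is largest (maximin). The model `MDP`, the two criteria (negative travel time, negative risk), the shared cost scale and the brute-force enumeration are all assumptions made for illustration, not the paper's algorithm.

```python
from itertools import product

# Hypothetical toy model (an assumption, not from the paper):
# MDP[state][action] = [(probability, next_state, reward_vector), ...]
# Reward vectors are (-travel_time, -risk); "G" is the absorbing goal.
MDP = {
    "A": {"short": [(1.0, "G", (-2.0, -6.0))],
          "long":  [(1.0, "B", (-2.0, -1.0))]},
    "B": {"go":   [(0.9, "G", (-2.0, -1.0)), (0.1, "B", (-2.0, -1.0))],
          "rush": [(1.0, "G", (-1.0, -5.0))]},
}
N_OBJ = 2  # number of objectives in each reward vector


def evaluate(policy, sweeps=200):
    """Iterative policy evaluation keeping one value per objective."""
    v = {s: [0.0] * N_OBJ for s in list(MDP) + ["G"]}
    for _ in range(sweeps):
        new_v = {"G": [0.0] * N_OBJ}  # the goal has no further cost
        for s, a in policy.items():
            new_v[s] = [sum(p * (r[k] + v[nxt][k]) for p, nxt, r in MDP[s][a])
                        for k in range(N_OBJ)]
        v = new_v
    return v


def egalitarian_policy(start="A"):
    """Enumerate deterministic policies; maximize the worst objective value."""
    states = list(MDP)
    best = None
    for actions in product(*(MDP[s] for s in states)):
        policy = dict(zip(states, actions))
        value = evaluate(policy)[start]
        if best is None or min(value) > min(best[1]):
            best = (policy, value)
    return best


if __name__ == "__main__":
    policy, value = egalitarian_policy()
    print(policy, value)
```

Enumeration is only practical for very small models, and comparing the minimum across objectives presumes the criteria are expressed on a common scale. Both are simplifications of this sketch.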

HAL - Normandie Université