
    Scalable Verification of Markov Decision Processes

    Markov decision processes (MDPs) are useful for modelling concurrent process optimisation problems, but verifying them with numerical methods is often intractable. Existing approximate approaches do not scale well and are limited to memoryless schedulers. Here we present the basis of scalable verification for MDPs, using an O(1) memory representation of history-dependent schedulers. We thus facilitate scalable learning techniques and the use of massively parallel verification.
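    The abstract does not spell out the O(1) representation, but one plausible construction (a sketch under our own assumptions, not necessarily the authors' exact method) identifies each history-dependent scheduler with a single integer seed and derives every decision by hashing the seed together with the history seen so far:

        import hashlib

        def schedule(seed: int, history: tuple, actions: list):
            """Deterministically pick an action for a given history.

            The scheduler is named by one integer seed, so storing it takes
            O(1) memory; history-dependence comes from hashing the seed
            together with the full state-action history.
            """
            digest = hashlib.sha256(f"{seed}:{history}".encode()).digest()
            index = int.from_bytes(digest[:8], "big") % len(actions)
            return actions[index]

        # Each seed names one history-dependent scheduler, and simulations
        # under different seeds can run in parallel with no shared state.
        for seed in range(3):
            print(schedule(seed, history=("s0", "a1", "s2"), actions=["left", "right"]))

    Because the choice is a pure function of (seed, history), sampling many schedulers and replaying runs is cheap and embarrassingly parallel.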

    Feature Selection by Singular Value Decomposition for Reinforcement Learning

    Solving reinforcement learning problems using value function approximation requires good state features, but constructing them manually is often difficult or impossible. We propose Fast Feature Selection (FFS), a new method for automatically constructing good features in problems with high-dimensional state spaces but low-rank dynamics. Such problems are common when, for example, controlling simple dynamic systems using direct visual observations, with states represented by raw images. FFS relies on domain samples and singular value decomposition to construct features that approximate the optimal value function well. Compared with earlier methods, such as LFD, FFS is simpler and enjoys better theoretical performance guarantees. Our experimental results show that our approach is also more stable, computes better solutions, and can be faster than prior work.
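    As a rough illustration of the idea (a sketch under our own assumptions, not the authors' exact FFS algorithm), features can be read off the singular value decomposition of a matrix built from an estimated transition matrix and sampled rewards:

        import numpy as np

        def svd_features(P_hat: np.ndarray, r: np.ndarray, k: int) -> np.ndarray:
            """Hypothetical SVD-based feature construction.

            P_hat : (n, n) empirical transition matrix estimated from samples
            r     : (n,)  sampled rewards
            k     : number of features to keep

            Under low-rank dynamics, the top singular vectors of [P_hat | r]
            span a subspace in which the value function can be approximated
            well; we use them as state features.
            """
            A = np.hstack([P_hat, r.reshape(-1, 1)])
            U, s, Vt = np.linalg.svd(A, full_matrices=False)
            return U[:, :k]   # one row of features per state

        # Toy low-rank problem (illustrative only).
        rng = np.random.default_rng(0)
        B = rng.random((50, 3))
        P_hat = B @ rng.random((3, 50))
        P_hat /= P_hat.sum(axis=1, keepdims=True)
        Phi = svd_features(P_hat, rng.random(50), k=4)
        print(Phi.shape)  # (50, 4)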

    Finding optimal paths on dynamic road networks

    This thesis examines different methods for computing optimal paths on dynamic graphs. Two general approaches are compared: deterministic and probabilistic. The deterministic approach assumes some prior knowledge of upcoming changes in the environment; the probabilistic approach attempts to model and manage uncertainty. A dynamic variant of Dijkstra's algorithm is detailed for the deterministic setting, while Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are explored for the probabilistic one. Applications and performance measurements are presented for each approach. We observe an inverse relationship between the computational tractability of the proposed approaches and their practical applicability: the deterministic approach is a very efficient solution to a simplified version of the problem, whereas POMDPs are a powerful theoretical model whose implementation is infeasible for large problems. As an alternative, this thesis proposes an approach based on MDPs.
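    To make the MDP formulation concrete, here is a minimal value-iteration sketch on an invented toy road network (the graph, probabilities, and costs are illustrative, not taken from the thesis):

        # mdp[state][action] = list of (probability, next_state, travel_cost)
        mdp = {
            "A": {"toB": [(0.9, "B", 5.0), (0.1, "B", 20.0)],   # B may be congested
                  "toC": [(1.0, "C", 8.0)]},
            "B": {"toD": [(1.0, "D", 4.0)]},
            "C": {"toD": [(0.8, "D", 3.0), (0.2, "D", 15.0)]},
            "D": {},                                            # destination
        }

        V = {s: 0.0 for s in mdp}
        for _ in range(100):  # iterate to (approximate) convergence
            for s, actions in mdp.items():
                if not actions:          # destination: zero remaining cost
                    continue
                V[s] = min(sum(p * (c + V[t]) for p, t, c in outcomes)
                           for outcomes in actions.values())

        print(V)  # expected remaining travel time from each intersection

    The same recursion underlies the probabilistic approach in the thesis: each iteration propagates expected travel costs backwards from the destination under the stochastic transition model.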

    Formal Modelling for Multi-Robot Systems Under Uncertainty

    Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation. Recent Findings: Recent work has investigated models which more accurately capture multi-robot execution by considering different forms of uncertainty, such as temporal uncertainty and partial observability, and by modelling the effects of robot interactions on action execution. Other strands of work have presented approaches for reducing the size of multi-robot models to admit more efficient solution methods. This can be achieved by decoupling the robots under independence assumptions, or by reasoning over higher-level macro-actions. Summary: Existing multi-robot models demonstrate a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions, and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods.
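    As a toy illustration of why decoupling helps (our own example, not a formalism from the paper): the joint state space grows as the product of the per-robot state spaces, while an independence assumption lets transition probabilities factorise into small per-robot models:

        from itertools import product

        # Two robots, each with its own local states; the joint model is
        # their Cartesian product, so its size is the product of the sizes.
        robot_states = [["r1_a", "r1_b", "r1_c"], ["r2_a", "r2_b"]]
        joint_states = list(product(*robot_states))
        print(len(joint_states))  # 6 = 3 * 2

        # Under a full independence assumption, a joint transition
        # probability factorises into per-robot terms, so we only store the
        # small per-robot models instead of the exponential joint one.
        def joint_prob(per_robot_P, s, a, s_next):
            p = 1.0
            for P_i, s_i, a_i, t_i in zip(per_robot_P, s, a, s_next):
                p *= P_i[(s_i, a_i, t_i)]
            return p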

    Parameter Synthesis for Markov Models

    Markov chain analysis is a key technique in reliability engineering. A practical obstacle is that all probabilities in Markov models need to be known. However, system quantities such as failure rates or packet loss ratios are often not known, or only partially known. This motivates considering parametric models whose transitions are labeled with functions over parameters. Whereas traditional Markov chain analysis evaluates a reliability metric for a single, fixed set of probabilities, analysing parametric Markov models focuses on synthesising parameter values that establish a given reliability or performance specification φ. Examples are: which component failure rates ensure that the probability of a system breakdown stays below 0.00000001, or which failure rates maximise reliability? This paper presents various analysis algorithms for parametric Markov chains and Markov decision processes. We focus on three problems: (a) do all parameter values within a given region satisfy φ?, (b) which regions satisfy φ and which ones do not?, and (c) an approximate version of (b) focusing on covering a large fraction of all possible parameter values. We give a detailed account of the various algorithms, present a software tool realising these techniques, and report on an extensive experimental evaluation on benchmarks that span a wide range of applications.
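    As a hand-worked illustration of problem (a) (a toy parametric chain of our own devising, not one of the paper's benchmarks):

        # Parametric Markov chain: from s0 we fail with probability p, else
        # move to s1; from s1 we fail with probability q, else succeed.
        # The probability of reaching "fail" is the rational function
        #     f(p, q) = p + (1 - p) * q
        def f(p, q):
            return p + (1 - p) * q

        # Region verification: do ALL values in [0.1, 0.4] x [0.2, 0.3]
        # satisfy f(p, q) <= 0.6?  Here f is monotone increasing in both
        # parameters, so checking the worst corner of the region suffices.
        p_max, q_max = 0.4, 0.3
        print(f(p_max, q_max) <= 0.6)  # True: the whole region satisfies φ

    Real tools handle far richer rational functions and regions where no such monotonicity shortcut applies, but the shape of the question is the same: a yes/no verdict for an entire parameter region rather than a single point.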