
    Scalable Verification of Markov Decision Processes

    Markov decision processes (MDPs) are useful for modelling concurrent process optimisation problems, but verifying them with numerical methods is often intractable. Existing approximate approaches do not scale well and are limited to memoryless schedulers. Here we present the basis of scalable verification for MDPs, using an O(1) memory representation of history-dependent schedulers. We thus facilitate scalable learning techniques and the use of massively parallel verification.
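    The abstract does not spell out the O(1) representation, but one plausible construction (a sketch under our own assumptions, not necessarily the authors' exact method) identifies each history-dependent scheduler with a single integer seed and derives every decision by hashing the seed together with the history seen so far:

        import hashlib

        def schedule(seed: int, history: tuple, actions: list):
            """Deterministically pick an action for a given history.

            The scheduler is named by one integer seed, so storing it takes
            O(1) memory; history-dependence comes from hashing the seed
            together with the full state-action history.
            """
            digest = hashlib.sha256(f"{seed}:{history}".encode()).digest()
            index = int.from_bytes(digest[:8], "big") % len(actions)
            return actions[index]

        # Each seed names one history-dependent scheduler, and simulations
        # under different seeds can run in parallel with no shared state.
        for seed in range(3):
            print(schedule(seed, history=("s0", "a1", "s2"), actions=["left", "right"]))

    Because the choice is a pure function of (seed, history), sampling many schedulers and replaying runs is cheap and embarrassingly parallel.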

    Feature Selection by Singular Value Decomposition for Reinforcement Learning

    Solving reinforcement learning problems using value function approximation requires good state features, but constructing them manually is often difficult or impossible. We propose Fast Feature Selection (FFS), a new method for automatically constructing good features in problems with high-dimensional state spaces but low-rank dynamics. Such problems are common when, for example, controlling simple dynamic systems using direct visual observations, with states represented by raw images. FFS relies on domain samples and singular value decomposition to construct features that approximate the optimal value function well. Compared with earlier methods, such as LFD, FFS is simpler and enjoys better theoretical performance guarantees. Our experimental results show that our approach is also more stable, computes better solutions, and can be faster than prior work.
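    As a rough illustration of the idea (a sketch under our own assumptions, not the authors' exact FFS algorithm), features can be read off the singular value decomposition of a matrix built from an estimated transition matrix and sampled rewards:

        import numpy as np

        def svd_features(P_hat: np.ndarray, r: np.ndarray, k: int) -> np.ndarray:
            """Hypothetical SVD-based feature construction.

            P_hat : (n, n) empirical transition matrix estimated from samples
            r     : (n,)  sampled rewards
            k     : number of features to keep

            Under low-rank dynamics, the top singular vectors of [P_hat | r]
            span a subspace in which the value function can be approximated
            well; we use them as state features.
            """
            A = np.hstack([P_hat, r.reshape(-1, 1)])
            U, s, Vt = np.linalg.svd(A, full_matrices=False)
            return U[:, :k]   # one row of features per state

        # Toy low-rank problem (illustrative only).
        rng = np.random.default_rng(0)
        B = rng.random((50, 3))
        P_hat = B @ rng.random((3, 50))
        P_hat /= P_hat.sum(axis=1, keepdims=True)
        Phi = svd_features(P_hat, rng.random(50), k=4)
        print(Phi.shape)  # (50, 4)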

    Finding optimal paths on dynamic road networks

    This thesis examines different methods for computing optimal paths on dynamic graphs. Two general approaches are compared: deterministic and probabilistic. The deterministic approach assumes some prior knowledge of upcoming changes in the environment; the probabilistic approach attempts to model and manage uncertainty. A dynamic variant of Dijkstra's algorithm is detailed for the deterministic setting, while Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are explored for the probabilistic one. Applications and performance measurements are presented for each approach. We observe an inverse relationship between the computational tractability of the proposed approaches and their practical applicability: the deterministic approach is a very efficient solution to a simplified version of the problem, whereas POMDPs are a powerful theoretical model whose implementation is infeasible for large problems. As an alternative, this thesis proposes an approach based on MDPs.
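    To make the MDP formulation concrete, here is a minimal value-iteration sketch on an invented toy road network (the graph, probabilities, and costs are illustrative, not taken from the thesis):

        # mdp[state][action] = list of (probability, next_state, travel_cost)
        mdp = {
            "A": {"toB": [(0.9, "B", 5.0), (0.1, "B", 20.0)],   # B may be congested
                  "toC": [(1.0, "C", 8.0)]},
            "B": {"toD": [(1.0, "D", 4.0)]},
            "C": {"toD": [(0.8, "D", 3.0), (0.2, "D", 15.0)]},
            "D": {},                                            # destination
        }

        V = {s: 0.0 for s in mdp}
        for _ in range(100):  # iterate to (approximate) convergence
            for s, actions in mdp.items():
                if not actions:          # destination: zero remaining cost
                    continue
                V[s] = min(sum(p * (c + V[t]) for p, t, c in outcomes)
                           for outcomes in actions.values())

        print(V)  # expected remaining travel time from each intersection

    The same recursion underlies the probabilistic approach in the thesis: each iteration propagates expected travel costs backwards from the destination under the stochastic transition model.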

    Formal Modelling for Multi-Robot Systems Under Uncertainty

    Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation. Recent Findings: Recent work has investigated models which more accurately capture multi-robot execution by considering different forms of uncertainty, such as temporal uncertainty and partial observability, and by modelling the effects of robot interactions on action execution. Other strands of work have presented approaches for reducing the size of multi-robot models to admit more efficient solution methods. This can be achieved by decoupling the robots under independence assumptions, or by reasoning over higher-level macro-actions. Summary: Existing multi-robot models demonstrate a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions, and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods.
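    As a toy illustration of why decoupling helps (our own example, not a formalism from the paper): the joint state space grows as the product of the per-robot state spaces, while an independence assumption lets transition probabilities factorise into small per-robot models:

        from itertools import product

        # Two robots, each with its own local states; the joint model is
        # their Cartesian product, so its size is the product of the sizes.
        robot_states = [["r1_a", "r1_b", "r1_c"], ["r2_a", "r2_b"]]
        joint_states = list(product(*robot_states))
        print(len(joint_states))  # 6 = 3 * 2

        # Under a full independence assumption, a joint transition
        # probability factorises into per-robot terms, so we only store the
        # small per-robot models instead of the exponential joint one.
        def joint_prob(per_robot_P, s, a, s_next):
            p = 1.0
            for P_i, s_i, a_i, t_i in zip(per_robot_P, s, a, s_next):
                p *= P_i[(s_i, a_i, t_i)]
            return p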

    Parameter Synthesis for Markov Models

    Markov chain analysis is a key technique in reliability engineering. A practical obstacle is that all probabilities in Markov models need to be known. However, system quantities such as failure rates or packet loss ratios are often not known, or only partially known. This motivates considering parametric models whose transitions are labeled with functions over parameters. Whereas traditional Markov chain analysis evaluates a reliability metric for a single, fixed set of probabilities, analysing parametric Markov models focuses on synthesising parameter values that establish a given reliability or performance specification φ. Examples are: which component failure rates ensure that the probability of a system breakdown stays below 0.00000001, or which failure rates maximise reliability? This paper presents various analysis algorithms for parametric Markov chains and Markov decision processes. We focus on three problems: (a) do all parameter values within a given region satisfy φ?, (b) which regions satisfy φ and which ones do not?, and (c) an approximate version of (b) focusing on covering a large fraction of all possible parameter values. We give a detailed account of the various algorithms, present a software tool realising these techniques, and report on an extensive experimental evaluation on benchmarks that span a wide range of applications.
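    As a hand-worked illustration of problem (a) (a toy parametric chain of our own devising, not one of the paper's benchmarks):

        # Parametric Markov chain: from s0 we fail with probability p, else
        # move to s1; from s1 we fail with probability q, else succeed.
        # The probability of reaching "fail" is the rational function
        #     f(p, q) = p + (1 - p) * q
        def f(p, q):
            return p + (1 - p) * q

        # Region verification: do ALL values in [0.1, 0.4] x [0.2, 0.3]
        # satisfy f(p, q) <= 0.6?  Here f is monotone increasing in both
        # parameters, so checking the worst corner of the region suffices.
        p_max, q_max = 0.4, 0.3
        print(f(p_max, q_max) <= 0.6)  # True: the whole region satisfies φ

    Real tools handle far richer rational functions and regions where no such monotonicity shortcut applies, but the shape of the question is the same: a yes/no verdict for an entire parameter region rather than a single point.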