Scalable Verification of Markov Decision Processes
Markov decision processes (MDPs) are useful for modelling concurrent process optimisation problems, but verifying them with numerical methods is often intractable. Existing approximate approaches do not scale well and are limited to memoryless schedulers. Here we present the basis of scalable verification for MDPs, using an O(1) memory representation of history-dependent schedulers. We thus facilitate scalable learning techniques and the use of massively parallel verification.
Comment: V4: FMDS version, 12 pages, 4 figures
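The O(1) memory representation can be illustrated with a common trick from statistical model checking: identify a scheduler by a single integer seed, and recover its decision for any history by hashing the seed together with that history. The sketch below is an assumption-laden illustration (the function name and hash choice are mine), not the paper's exact construction:

```python
import random
import zlib

def scheduler_action(seed, history, actions):
    """Pick an action deterministically from (seed, history).

    The scheduler is identified by a single integer seed; its decision
    for any execution history is recovered by hashing, so no decision
    table is stored -- O(1) memory regardless of history length.
    """
    h = zlib.crc32(repr(history).encode()) ^ seed
    rng = random.Random(h)          # seeded PRNG: same inputs, same choice
    return actions[rng.randrange(len(actions))]

# Sampling many seeds amounts to sampling many history-dependent
# schedulers, each of which can be simulated independently in parallel.
print(scheduler_action(42, ("s0", "s1"), ["a", "b", "c"]))
```

Because each scheduler is fully determined by its seed, candidate schedulers can be evaluated by Monte Carlo simulation and the best seed retained, which is what makes massively parallel verification plausible.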
Feature Selection by Singular Value Decomposition for Reinforcement Learning
Solving reinforcement learning problems using value function approximation requires having good state features, but constructing them manually is often difficult or impossible. We propose Fast Feature Selection (FFS), a new method for automatically constructing good features in problems with high-dimensional state spaces but low-rank dynamics. Such problems are common when, for example, controlling simple dynamic systems using direct visual observations with states represented by raw images. FFS relies on domain samples and singular value decomposition to construct features that can be used to approximate the optimal value function well. Compared with earlier methods, such as LFD, FFS is simpler and enjoys better theoretical performance guarantees. Our experimental results show that our approach is also more stable, computes better solutions, and can be faster when compared with prior work.
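As a rough illustration of the low-rank idea, one can sample raw states, take a truncated SVD, and use the top singular directions as a linear feature map. The data, dimensions, and variable names below are hypothetical; this is a sketch of the general SVD-feature recipe under a low-rank assumption, not the FFS algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 sampled raw-observation states of dimension 50,
# generated from rank-3 latent dynamics (the low-rank assumption).
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
states = latent @ mixing

# Truncated SVD: the top-k right singular vectors give a linear feature map.
k = 3
U, s, Vt = np.linalg.svd(states, full_matrices=False)
feature_map = Vt[:k].T            # shape (50, k): raw state -> k features

features = states @ feature_map   # k compact features per sampled state
# Because the data has rank 3, projecting back through the feature map
# reconstructs the raw states almost exactly.
```

The value function is then approximated as a linear combination of these k features instead of the 50 raw dimensions.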
Finding optimal paths on dynamic road networks
This document examines different methods to compute optimal paths on dynamic graphs. Two general approaches are compared: deterministic and probabilistic. The deterministic approach takes for granted some prior knowledge of the environment's future behaviour. The probabilistic approach attempts to model and manage uncertainty. A dynamic variant of Dijkstra's algorithm is detailed for the deterministic setting. Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) are analysed for the probabilistic setting. Applications and performance measures are given for each approach. We observe an inverse relationship between the computability of the proposed approaches and their practical applicability. Deterministic approaches prove a fast and efficient way to solve simpler versions of the problem. POMDPs are a powerful theoretical model whose implementation is infeasible for large problems. An alternative is proposed through the use of MDPs.
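For the deterministic setting, the baseline below is plain Dijkstra with a binary heap, re-run after an edge-weight update. The toy road network and names are illustrative assumptions; a true dynamic variant, as discussed in the document, would repair only the affected part of the shortest-path tree instead of recomputing from scratch:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source; graph: node -> {neighbour: weight}."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Hypothetical road network; a "dynamic" change is an edge-weight update
# followed by recomputation.
roads = {"A": {"B": 1, "C": 4}, "B": {"C": 1}, "C": {}}
print(dijkstra(roads, "A")["C"])   # 2, via A-B-C
roads["B"]["C"] = 5                # congestion appears on B-C
print(dijkstra(roads, "A")["C"])   # 4, now via A-C
```

Recomputing from scratch is O((V + E) log V) per change; the appeal of the dynamic variant is amortising that cost when changes touch few edges.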
Formal Modelling for Multi-Robot Systems Under Uncertainty
Purpose of Review: To effectively synthesise and analyse multi-robot
behaviour, we require formal task-level models which accurately capture
multi-robot execution. In this paper, we review modelling formalisms for
multi-robot systems under uncertainty, and discuss how they can be used for
planning, reinforcement learning, model checking, and simulation.
Recent Findings: Recent work has investigated models which more accurately
capture multi-robot execution by considering different forms of uncertainty,
such as temporal uncertainty and partial observability, and modelling the
effects of robot interactions on action execution. Other strands of work have
presented approaches for reducing the size of multi-robot models to admit more
efficient solution methods. This can be achieved by decoupling the robots under
independence assumptions, or reasoning over higher level macro actions.
Summary: Existing multi-robot models exhibit a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions; and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods.
Comment: 23 pages, 0 figures, 2 tables. Current Robotics Reports (2023). This version of the article has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://dx.doi.org/10.1007/s43154-023-00104-
Parameter Synthesis for Markov Models
Markov chain analysis is a key technique in reliability engineering. A practical obstacle is that all probabilities in Markov models need to be known. However, system quantities such as failure rates or packet loss ratios are often not known, or only partially known. This motivates considering parametric models with transitions labelled with functions over parameters. Whereas traditional Markov chain analysis evaluates a reliability metric for a single, fixed set of probabilities, analysing parametric Markov models focuses on synthesising parameter values that establish a given reliability or performance specification. Examples are: which component failure rates ensure that the probability of a system breakdown is below 0.00000001, or which failure rates maximise reliability? This paper presents various analysis algorithms for parametric Markov chains and Markov decision processes. We focus on three problems: (a) do all parameter values within a given region satisfy the specification?, (b) which regions satisfy the specification and which do not?, and (c) an approximate version of (b) focusing on covering a large fraction of all possible parameter values. We give a detailed account of the various algorithms, present a software tool realising these techniques, and report on an extensive experimental evaluation on benchmarks that span a wide range of applications.
Comment: 38 pages
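Problem (a) can be illustrated on a toy parametric chain where the reachability probability is a closed-form rational function of the parameters. The model, names, and grid-sampling check below are my assumptions; a sound synthesis tool of the kind the paper describes reasons exactly over the rational functions rather than sampling:

```python
def reach_prob(p, q):
    """Probability of reaching 'goal' in a toy parametric Markov chain:
    s0 -p-> goal, s0 -(1-p)-> s1, s1 -q-> goal, s1 -(1-q)-> sink.
    The reachability probability is the rational function p + (1-p)*q.
    """
    return p + (1 - p) * q

def region_satisfies(lo, hi, threshold, steps=50):
    """Grid-sample check of problem (a): do all (p, q) in [lo, hi]^2 give
    reachability >= threshold?  Sampling can only refute, not prove; an
    exact solver would reason symbolically over the rational function."""
    pts = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return all(reach_prob(p, q) >= threshold for p in pts for q in pts)

print(region_satisfies(0.6, 0.9, 0.8))   # True: worst corner gives 0.6 + 0.4*0.6 = 0.84
print(region_satisfies(0.3, 0.9, 0.8))   # False: corner (0.3, 0.3) gives 0.51
```

Since reach_prob is monotone in both parameters here, the minimum sits at the region's lower corner, which is why the grid check agrees with the exact answer in this toy case.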