
    Sensor Synthesis for POMDPs with Reachability Objectives

    Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors. We study a setting in which the sensors are only partially defined and the goal is to synthesize "weakest" additional sensors, such that in the resulting POMDP, there is a small-memory policy for the agent that almost-surely (with probability 1) satisfies a reachability objective. We show that the problem is NP-complete, and present a symbolic algorithm by encoding the problem into SAT instances. We illustrate trade-offs between the amount of memory of the policy and the number of additional sensors on a simple example. We have implemented our approach and evaluated it on three classical POMDP examples from the literature; in all of them, the number of sensors can be significantly decreased (compared to the existing solutions in the literature) without increasing the complexity of the policies.
    Comment: arXiv admin note: text overlap with arXiv:1511.0845

    POMDPs under Probabilistic Semantics

    We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated with every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) a quantitative constraint, which defines the set of paths whose payoff is at least a given threshold lambda_1 in (0,1]; and (ii) a qualitative constraint, which is the special case of a quantitative constraint with lambda_1=1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for the qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For the quantitative path constraint, we show that the problem of deciding the existence of a finite-memory controller is undecidable.
    Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)
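As a reading aid, the limit-average payoff and the two path constraints described in this abstract can be written out as follows. This is only a sketch: the symbols $\rho$, $r$, $s_i$, $a_i$, and $\sigma$ are notation assumed here, and the use of $\liminf$ is a common convention that the abstract itself does not specify.

```latex
% Assumed notation: \rho = (s_0, a_0, s_1, a_1, \dots) is an infinite path,
% r(s_i, a_i, s_{i+1}) \in [0,1] is the reward of the i-th transition.
\[
  \mathrm{LimAvg}(\rho) \;=\; \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} r(s_i, a_i, s_{i+1})
\]
% Quantitative constraint with threshold \lambda_1 \in (0,1]:
\[
  \{\, \rho \;\mid\; \mathrm{LimAvg}(\rho) \ge \lambda_1 \,\}
\]
% Qualitative constraint: the special case \lambda_1 = 1.
% Almost-sure winning under a controller \sigma:
\[
  \Pr\nolimits^{\sigma}\bigl(\mathrm{LimAvg} \ge \lambda_1\bigr) = 1
\]
```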

    Limit Your Consumption! Finding Bounds in Average-energy Games

    Energy games are infinite two-player games played in weighted arenas with quantitative objectives that restrict the consumption of a resource modeled by the weights, e.g., a battery that is charged and drained. Typically, upper and/or lower bounds on the battery capacity are part of the problem description. Here, we consider the problem of determining upper bounds on the average accumulated energy or on the capacity while satisfying a given lower bound, i.e., we do not determine whether a given bound is sufficient to meet the specification, but whether there exists a sufficient bound to meet it. In the classical setting with positive and negative weights, we show that the problem of determining the existence of a sufficient bound on the long-run average accumulated energy can be solved in doubly-exponential time. Then, we consider recharge games: here, all weights are negative, but there are recharge edges that recharge the energy to some fixed capacity. We show that bounding the long-run average energy in such games is complete for exponential time. Then, we consider the existential version of the problem, which turns out to be solvable in polynomial time: here, we ask whether there is a recharge capacity that allows the system player to win the game. We conclude by studying tradeoffs between the memory needed to implement strategies and the bounds they realize. We give an example showing that memory can be traded for bounds and vice versa. Also, we show that increasing the capacity makes it possible to lower the average accumulated energy.
    Comment: In Proceedings QAPL'16, arXiv:1610.0769

    Two Variable vs. Linear Temporal Logic in Model Checking and Games

    Model checking linear-time properties expressed in first-order logic has non-elementary complexity, and thus various restricted logical languages are employed. In this paper we consider two such restricted specification logics, linear temporal logic (LTL) and two-variable first-order logic (FO2). LTL is more expressive but FO2 can be more succinct, and hence it is not clear which should be easier to verify. We take a comprehensive look at the issue, giving a comparison of verification problems for FO2, LTL, and various sublogics thereof across a wide range of models. In particular, we look at unary temporal logic (UTL), a subset of LTL that is expressively equivalent to FO2; we also consider the stutter-free fragment of FO2, obtained by omitting the successor relation, and the expressively equivalent fragment of UTL, obtained by omitting the next and previous connectives. We give three logic-to-automata translations which can be used to give upper bounds for FO2 and UTL and various sublogics. We apply these to get new bounds for both non-deterministic systems (hierarchical and recursive state machines, games) and for probabilistic systems (Markov chains, recursive Markov chains, and Markov decision processes). We couple these with matching lower-bound arguments. Next, we look at combining FO2 verification techniques with those for LTL. We present a language that subsumes both FO2 and LTL, and inherits the model checking properties of both languages. Our results give both a unified approach to understanding the behaviour of FO2 and LTL and a nearly comprehensive picture of the complexity of verification for these logics and their sublogics.
    Comment: 37 pages, to be published in Logical Methods in Computer Science journal, includes material presented in Concur 2011 and QEST 2012 extended abstract

    Parametric LTL on Markov Chains

    This paper is concerned with the verification of finite Markov chains against parametrized LTL (pLTL) formulas. In pLTL, the until-modality is equipped with a bound that contains variables; e.g., $\Diamond_{\le x}\,\varphi$ asserts that $\varphi$ holds within $x$ time steps, where $x$ is a variable ranging over the natural numbers. The central problem studied in this paper is to determine the set of parameter valuations $V_{\prec p}(\varphi)$ for which the probability to satisfy the pLTL formula $\varphi$ in a Markov chain meets a given threshold $\prec p$, where $\prec$ is a comparison on reals and $p$ a probability. Since determining the emptiness of $V_{> 0}(\varphi)$ is undecidable for pLTL, we consider several logic fragments. We consider parametric reachability properties, a sub-logic of pLTL restricted to next and $\Diamond_{\le x}$, parametric Büchi properties and, finally, a maximal subclass of pLTL for which emptiness of $V_{> 0}(\varphi)$ is decidable.
    Comment: TCS Track B 201
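To make the parametric reachability fragment mentioned in this abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it simply enumerates bounds $x$ up to a cutoff and collects those for which the probability of reaching a target set within $x$ steps is at least $p$. The function names, the 3-state chain, and the cutoff `max_x` are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def bounded_reach_probs(P, targets, start, max_x):
    """Return [Pr(reach `targets` within x steps from `start`) for x = 0..max_x].

    P is a row-stochastic transition matrix given as a list of lists.
    """
    n = len(P)
    # prob[s] = probability of reaching a target from s within the current bound
    prob = [Fraction(1) if s in targets else Fraction(0) for s in range(n)]
    result = [prob[start]]
    for _ in range(max_x):
        # One more step allowed: target states stay at 1, others take one transition.
        prob = [
            Fraction(1) if s in targets
            else sum(P[s][t] * prob[t] for t in range(n))
            for s in range(n)
        ]
        result.append(prob[start])
    return result


def valuations_meeting_threshold(P, targets, start, p, max_x):
    """Values x in 0..max_x with Pr(reach targets within x steps) >= p."""
    probs = bounded_reach_probs(P, targets, start, max_x)
    return [x for x, pr in enumerate(probs) if pr >= p]


# Hypothetical 3-state chain: 0 -> {0, 1}, 1 -> {1, 2}, 2 is absorbing.
half = Fraction(1, 2)
P = [
    [half, half, 0],
    [0, half, half],
    [0, 0, 1],
]

print(bounded_reach_probs(P, {2}, 0, 4))
print(valuations_meeting_threshold(P, {2}, 0, half, 4))
```

Because the bounded reachability probability is non-decreasing in $x$, the satisfying valuations for a threshold of the form $\ge p$ are upward closed, so in this special case finding the smallest satisfying $x$ already describes the whole set.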

    Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial

    Recent results of Ye and of Hansen, Miltersen, and Zwick show that policy iteration for one- or two-player (perfect information) zero-sum stochastic games, restricted to instances with a fixed discount rate, is strongly polynomial. We show that policy iteration for mean-payoff zero-sum stochastic games is also strongly polynomial when restricted to instances with bounded first mean return time to a given state. The proof is based on methods of nonlinear Perron-Frobenius theory, allowing us to reduce the mean-payoff problem to a discounted problem with state-dependent discount rate. Our analysis also shows that policy iteration remains strongly polynomial for discounted problems in which the discount rate can be state dependent (and even negative) at certain states, provided that the spectral radii of the nonnegative matrices associated with all strategies are bounded from above by a fixed constant strictly less than 1.
    Comment: 17 pages