    Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes

    We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system. We obtain the confidence that the underlying system satisfies a given property, and show that the method uses data efficiently and is thus robust to the amount of data available. These characteristics are achieved by, first, exploiting parameter synthesis to establish a feasible set of parameters for which the underlying system will satisfy the property; second, actively synthesising experiments to increase the amount of information in the collected data that is relevant to the property; and, finally, propagating this information over the model parameters, obtaining a confidence that reflects our belief whether or not the system parameters lie in the feasible set, thereby solving the verification problem. Comment: QEST 2017, 18 pages, 7 figures
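The final step in this abstract, turning data into a confidence that the parameters lie in the feasible set, can be illustrated with a deliberately simplified sketch. It assumes a single unknown transition probability `p`, a feasible interval that parameter synthesis has already produced, and a uniform Beta(1, 1) prior updated with observed counts. These choices, and the names `beta_pdf` and `confidence`, are illustrative assumptions and not the paper's implementation, which handles full parametric MDPs and synthesises the experiments themselves.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def confidence(successes, failures, lower, upper, steps=10_000):
    """Posterior probability that p lies in the feasible interval [lower, upper].

    Assumes a uniform Beta(1, 1) prior (an assumption made for this sketch),
    so the posterior after the observations is Beta(1 + successes, 1 + failures).
    """
    a, b = 1 + successes, 1 + failures
    width = (upper - lower) / steps
    # Midpoint rule: never evaluates the density at exactly 0 or 1.
    total = sum(beta_pdf(lower + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width


# Hypothetical feasible set: the property holds whenever p >= 0.5.
no_data = confidence(0, 0, 0.5, 1.0)
some_data = confidence(9, 1, 0.5, 1.0)
print(f"confidence with no data:    {no_data:.4f}")
print(f"confidence after 9 of 10:   {some_data:.4f}")
```

With no data the confidence equals the prior mass of the feasible interval. Observations that favour the feasible region push it towards 1, which is the sense in which experiments that are informative about the property pay off.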

    Optimisation of stochastic networks with blocking: a functional-form approach

    This paper introduces a class of stochastic networks with blocking, motivated by applications arising in cellular network planning, mobile cloud computing, and spare parts supply chains. Blocking results in lost revenue due to customers or jobs being permanently removed from the system. We are interested in striking a balance between mitigating blocking by increasing service capacity, and maintaining low costs for service capacity. This problem is further complicated by the stochastic nature of the system. Owing to the complexity of the system, there are no analytical results available that formulate and solve the relevant optimisation problem in closed form. Traditional simulation-based methods may work well for small instances, but the associated computational costs are prohibitive for networks of realistic size. We propose a hybrid functional-form based approach for finding the optimal resource allocation, combining the speed of an analytical approach with the accuracy of simulation-based optimisation. The key insight is to replace the computationally expensive gradient estimation in simulation optimisation with a closed-form analytical approximation that is calibrated using a single simulation run. We develop two implementations of this approach and conduct extensive computational experiments on complex examples to show that it is capable of substantially improving system performance. We also provide evidence that our approach has substantially lower computational costs compared to stochastic approximation.

    Decision-Theoretic Planning with non-Markovian Rewards

    A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovian decision process (MDP) model. While the more tractable solution methods developed for MDPs do not directly apply in the presence of non-Markovian rewards, a number of solution methods for NMRDPs have been proposed in the literature. These all exploit a compact specification of the non-Markovian reward function in temporal logic, to automatically translate the NMRDP into an equivalent MDP which is solved using efficient MDP solution methods. This paper presents NMRDPP (Non-Markovian Reward Decision Process Planner), a software platform for the development and experimentation of methods for decision-theoretic planning with non-Markovian rewards. The current version of NMRDPP implements, under a single interface, a family of methods based on existing as well as new approaches which we describe in detail. These include dynamic programming, heuristic search, and structured methods. Using NMRDPP, we compare the methods and identify certain problem features that affect their performance. NMRDPP's treatment of non-Markovian rewards is inspired by the treatment of domain-specific search control knowledge in the TLPlan planner, which it incorporates as a special case. In the First International Probabilistic Planning Competition, NMRDPP was able to compete and perform well in both the domain-independent and hand-coded tracks, using search control knowledge in the latter.
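The translation of an NMRDP into an equivalent MDP can be illustrated with a toy sketch. Here the history-dependent reward is "1 for the first transition that lands in `s1`, 0 afterwards", and the history is summarised by a single hand-written memory bit rather than derived from a temporal-logic formula as the abstract describes. The two-state model, the function names, and the use of plain value iteration are all assumptions made for illustration and do not reflect NMRDPP's actual interface.

```python
from itertools import product

# Hypothetical toy model: two base states and two actions.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]


def transition(state, action):
    """Base dynamics: {next_state: probability}."""
    if action == "stay":
        return {state: 1.0}
    other = "s1" if state == "s0" else "s0"
    return {other: 0.9, state: 0.1}


def update_memory(visited, next_state):
    """One bit of history: has any transition landed in s1 so far?"""
    return visited or next_state == "s1"


def reward(visited, next_state):
    """Non-Markovian reward: paid only the first time a transition lands in s1."""
    return 1.0 if (next_state == "s1" and not visited) else 0.0


def expand():
    """Build the equivalent MDP over expanded states (state, memory)."""
    mdp = {}
    for (s, m), a in product(product(STATES, [False, True]), ACTIONS):
        mdp[((s, m), a)] = [
            ((s2, update_memory(m, s2)), p, reward(m, s2))
            for s2, p in transition(s, a).items()
        ]
    return mdp


def value_iteration(mdp, gamma=0.9, sweeps=200):
    """Standard in-place value iteration on the expanded, fully Markovian model."""
    values = {es: 0.0 for (es, _a) in mdp}
    for _ in range(sweeps):
        for es in values:
            values[es] = max(
                sum(p * (r + gamma * values[es2]) for es2, p, r in mdp[(es, a)])
                for a in ACTIONS
            )
    return values


values = value_iteration(expand())
for es in sorted(values):
    print(es, round(values[es], 4))
```

Once the memory bit is part of the state, the reward depends only on the current expanded state and its successor, so any ordinary MDP solver applies. The cost is a larger state space, which is why the choice of translation matters for performance.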