1,531 research outputs found

    Percentile Queries in Multi-Dimensional Markov Decision Processes

    Markov decision processes (MDPs) with multi-dimensional weights are useful to analyze systems with multiple objectives that may be conflicting and require the analysis of trade-offs. We study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP and a quantitative payoff function $f$, thresholds $v_i$ (one per dimension), and probability thresholds $\alpha_i$, we show how to compute a single strategy to enforce that for all dimensions $i$, the probability of outcomes $\rho$ satisfying $f_i(\rho) \geq v_i$ is at least $\alpha_i$. We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends to the quantitative case the multi-objective model checking problem studied by Etessami et al. in unweighted MDPs. Comment: Extended version of CAV 2015 paper

    Value Iteration for Long-run Average Reward in Markov Decision Processes

    Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other properties, such as reachability objectives. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally for each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending the above approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that is able to deal with very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks
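
As a rough illustration of the value iteration the abstract refers to, here is a minimal sketch for maximal reachability probabilities on a toy MDP. It is not the paper's algorithm: the dictionary encoding, the function name `reachability_value_iteration`, the example model, and the "stop when updates become small" rule are all assumptions made for this sketch. That stopping rule is a heuristic and gives no approximation guarantee in general, which is the kind of gap the abstract discusses for long-run average rewards.

```python
def reachability_value_iteration(mdp, targets, epsilon=1e-8, max_iters=100_000):
    """Approximate the maximal probability of reaching `targets`.

    mdp: dict mapping state -> dict mapping action -> list of (prob, next_state).
    States with no actions are treated as absorbing non-target states.
    """
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in mdp.items():
            if s in targets or not actions:
                continue  # target and absorbing states keep their value
            # Bellman update: best action by expected successor value
            best = max(sum(p * values[t] for p, t in succ)
                       for succ in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < epsilon:  # heuristic stopping rule (assumption, no guarantee)
            break
    return values


# Hypothetical toy model: from s0, action "b" reaches the goal with probability 0.8.
example_mdp = {
    "s0": {"a": [(0.5, "goal"), (0.5, "sink")],
           "b": [(0.8, "goal"), (0.2, "sink")]},
    "s1": {"a": [(0.5, "s1"), (0.5, "s0")]},
    "goal": {},
    "sink": {},
}
values = reachability_value_iteration(example_mdp, {"goal"})
```

In this toy model the value of `s1` is approached from below, because each iteration only moves part of the way toward the fixed point. That slow approach is why a small update does not by itself certify closeness to the true value.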

    The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

    We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the same value from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed. Then, actions leading to suboptimal states can be removed. We show the natural decision problem associated to computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.

    LNCS

    Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) are two standard formalisms in system analysis. Their main associated quantitative objectives are hitting probabilities, discounted sum, and mean payoff. Although there are many techniques for computing these objectives in general MCs/MDPs, they have not been thoroughly studied in terms of parameterized algorithms, particularly when treewidth is used as the parameter. This is in sharp contrast to qualitative objectives for MCs, MDPs and graph games, for which treewidth-based algorithms yield significant complexity improvements. In this work, we show that treewidth can also be used to obtain faster algorithms for the quantitative problems. For an MC with n states and m transitions, we show that each of the classical quantitative objectives can be computed in O((n+m)⋅t²) time, given a tree decomposition of the MC with width t. Our results also imply a bound of O(κ⋅(n+m)⋅t²) for each objective on MDPs, where κ is the number of strategy-iteration refinements required for the given input and objective. Finally, we make an experimental evaluation of our new algorithms on low-treewidth MCs and MDPs obtained from the DaCapo benchmark suite. Our experiments show that on low-treewidth MCs and MDPs, our algorithms outperform existing well-established methods by one or more orders of magnitude.

    Maximizing the Conditional Expected Reward for Reaching the Goal

    The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler. Comment: 103 pages, extended version with appendices of a paper accepted at TACAS 201

    LNCS

    We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit randomness as well as angelic and demonic non-determinism. Lexicographic order allows one to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic, and that memory is required only to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP∩coNP, matching the currently known bound for single objectives; in general, the decision problem is PSPACE-hard and can be solved in NEXPTIME∩coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to the computation of optimal strategies in a sequence of single-objective games. We have implemented our algorithm and report experimental results on various case studies.

    Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games

    Solving stochastic games with the reachability objective is a fundamental problem, especially in quantitative verification and synthesis. For this purpose, bounded value iteration (BVI) attracts attention as an efficient iterative method. However, BVI's performance is often impeded by costly end component (EC) computation that is needed to ensure convergence. Our contribution is a novel BVI algorithm that conducts, in addition to local propagation by the Bellman update that is typical of BVI, global propagation of upper bounds that is not hindered by ECs. To conduct global propagation in a computationally tractable manner, we construct a weighted graph and solve the widest path problem in it. Our experiments show the algorithm's performance advantage over the previous BVI algorithms that rely on EC computation. Comment: v2: a URL to the implementation is added
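
The widest path problem named in the abstract asks, for each node, for a path from a source that maximizes the smallest edge weight along it. The sketch below solves the generic problem with a Dijkstra-style search. It is not the paper's BVI algorithm, and the adjacency-list encoding, the function name `widest_path_widths`, and the example graph are assumptions made purely for illustration.

```python
import heapq


def widest_path_widths(graph, source):
    """Return, for every node, the best achievable bottleneck width from `source`.

    graph: dict mapping node -> list of (neighbor, capacity); every neighbor
    must also appear as a key. Unreachable nodes keep width 0.0.
    """
    width = {node: 0.0 for node in graph}
    width[source] = float("inf")  # the empty path has no bottleneck
    heap = [(-width[source], source)]  # max-heap via negated widths
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < width[u]:
            continue  # stale heap entry, a wider path to u was already found
        for v, cap in graph[u]:
            candidate = min(w, cap)  # bottleneck of the path extended by (u, v)
            if candidate > width[v]:
                width[v] = candidate
                heapq.heappush(heap, (-candidate, v))
    return width


# Hypothetical example graph with two routes from "a" to "d".
example_graph = {
    "a": [("b", 0.9), ("c", 0.4)],
    "b": [("d", 0.5)],
    "c": [("d", 0.8)],
    "d": [],
}
widths = widest_path_widths(example_graph, "a")
```

Here the route through `b` has bottleneck 0.5 and the route through `c` has bottleneck 0.4, so the search settles on the former for `d`.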

    Hot Jupiters from Secular Planet--Planet Interactions

    About 25 per cent of 'hot Jupiters' (extrasolar Jovian-mass planets with close-in orbits) are actually orbiting counter to the spin direction of the star. Perturbations from a distant binary star companion can produce high inclinations, but cannot explain orbits that are retrograde with respect to the total angular momentum of the system. Such orbits in a stellar context can be produced through secular (that is, long term) perturbations in hierarchical triple-star systems. Here we report a similar analysis of planetary bodies, including both octupole-order effects and tidal friction, and find that we can produce hot Jupiters in orbits that are retrograde with respect to the total angular momentum. With distant stellar mass perturbers, such an outcome is not possible. With planetary perturbers, the inner orbit's angular momentum component parallel to the total angular momentum need not be constant. In fact, as we show here, it can even change sign, leading to a retrograde orbit. A brief excursion to very high eccentricity during the chaotic evolution of the inner orbit allows planet-star tidal interactions to rapidly circularize that orbit, decoupling the planets and forming a retrograde hot Jupiter. Comment: accepted for publication by Nature, 3 figures (version after proof - some typos corrected)

    Characterisation of the Mouse Vasoactive Intestinal Peptide Receptor Type 2 Gene, Vipr2, and Identification of a Polymorphic LINE-1-like Sequence That Confers Altered Promoter Activity

    The VPAC(2) receptor is a seven transmembrane spanning G protein-coupled receptor for two neuropeptides, vasoactive intestinal peptide (VIP) and pituitary adenylate cyclase-activating polypeptide (PACAP). It has a distinct tissue-specific, developmental and inducible expression that underlies an important neuroendocrine role. Here, we report the characterisation of the gene that encodes the mouse VPAC(2) receptor (Vipr2), localisation of the transcriptional start site and functional analysis of the promoter region. The Vipr2 gene contains 12 introns within its protein-coding region and spans 68.6 kb. Comparison of the 5′ untranslated region sequences for cloned 5′-RACE products amplified from different tissues showed they all were contained within the same exon, with the longest extending 111 bp upstream of the ATG start site. Functional analysis of the 3.2-kb 5′-flanking region using sequentially deleted sequences cloned into a luciferase gene reporter vector revealed that this region is active as a promoter in mouse AtT20 D16:16 and rat GH4C1 cell lines. The core promoter is located within a 180-bp GC-rich region proximal to the ATG start codon and contains potential binding sites for Sp1 and AP2, but no TATA-box. Further upstream, in two out of three mouse strains examined, we have discovered a 496-bp polymorphic DNA sequence that bears a significant identity to mouse LINE-1 DNA. Comparison of the promoter activity between luciferase reporter gene constructs derived from the BALB/c (which contains this sequence) and C57BL/6J (which lacks this sequence) Vipr2 promoter regions showed a three-fold difference in luciferase gene activity when expressed in mouse AtT20 D16:16 and αT3-1 cells, but not when expressed in the rat GH4C1 cells or in COS 7 cells. Our results suggest that the mouse Vipr2 gene may be differentially active in different mouse strains, depending on the presence of this LINE-1-like sequence in the promoter region.

    Early Infant Diagnosis of HIV in Three Regions in Tanzania; Successes and Challenges.

    By the end of 2009 an estimated 2.5 million children worldwide were living with HIV-1, mostly as a consequence of vertical transmission, and more than 90% of these children live in sub-Saharan Africa. In 2008 the World Health Organization (WHO) recommended early initiation of Highly Active Antiretroviral Therapy (HAART) for all HIV-infected infants diagnosed within the first year of life, and since 2010, within the first two years of life, irrespective of CD4 count or WHO clinical stage. The study aimed to describe the implementation of early infant diagnosis (EID) programs in three Tanzanian regions that differ in HIV prevalence and in logistical set-up with regard to HIV DNA testing. Data were obtained by review of the prevention of mother-to-child transmission of HIV (PMTCT) registers from 2009-2011 at the Reproductive and Child Health Clinics (RCH) and from the databases of the Care and Treatment Clinics (CTC) in all three regions: Kilimanjaro, Mbeya and Tanga. The statistical methods used were a Poisson regression model and the rank sum test. During the period 2009-2011 a total of 4,860 exposed infants were registered at the reviewed sites, of whom 4,292 (88.3%) were screened for HIV infection. The overall proportion of tested infants in the three regions increased from 77.2% in 2009 to 97.8% in 2011. A total of 452 (10.5%) were found to be HIV infected (judged by the result of the first test). The prevalence of HIV infection among infants was higher in Mbeya than in the Kilimanjaro region (RR = 1.872, 95% CI = 1.408-2.543, p < 0.001). However, sample turnaround time was significantly shorter in both Mbeya (2.7 weeks) and Tanga (5.0 weeks) than in Kilimanjaro (7.0 weeks), p < 0.001. Substantial loss to follow-up (LTFU) was evident at all stages of EID services in the period 2009 to 2011. Among the infants who were receiving treatment, 61% were found to be LTFU during the review period.
    The study showed an increase in testing of HIV-exposed infants over the three years, and there are large variations in HIV prevalence among the regions. Challenges such as sample turnaround time and LTFU must be overcome before this can translate into the intended goal of early initiation of lifelong, lifesaving antiretroviral therapy for the infants.