1,531 research outputs found

Percentile Queries in Multi-Dimensional Markov Decision Processes

Author: C Baier
C Haase
C Wu
DJ White
DP Bertsekas
JA Filar
K Chatterjee
K Etessami
L Alfaro de
M Randour
M Sakaguchi
M Ummels
Michael R Garey
ML Puterman
O Goldreich
S Toda
SD Travers
T Brázdil
U Boker
Y Ohtsubo
Publication date: 01/01/2015

Markov decision processes (MDPs) with multi-dimensional weights are useful to analyze systems with multiple objectives that may be conflicting and require the analysis of trade-offs. We study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP and a quantitative payoff function $f$, thresholds $v_i$ (one per dimension), and probability thresholds $\alpha_i$, we show how to compute a single strategy to enforce that for all dimensions $i$, the probability of outcomes $\rho$ satisfying $f_i(\rho) \geq v_i$ is at least $\alpha_i$. We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends to the quantitative case the multi-objective model checking problem studied by Etessami et al. in unweighted MDPs. Comment: Extended version of CAV 2015 paper
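The percentile query described in this abstract can be written as a single formula. The following is only a sketch of the statement: the symbols $\sigma$ (strategy), $M$ (the MDP), $s_{\mathrm{init}}$ (initial state) and $d$ (number of dimensions) are notation assumed here for illustration and do not appear in the abstract.

```latex
% Assumed notation: \sigma is a strategy, M the weighted MDP,
% s_init the initial state, d the number of weight dimensions.
\exists\, \sigma \;:\; \bigwedge_{i=1}^{d}
  \mathbb{P}^{\sigma}_{M,\, s_{\mathrm{init}}}
  \bigl[\, \{ \rho \mid f_i(\rho) \geq v_i \} \,\bigr] \;\geq\; \alpha_i
```

The point of the conjunction is that one strategy $\sigma$ must satisfy all $d$ probability constraints at once, rather than one strategy per dimension.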

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

DI-fusion

HAL-Rennes 1

Value Iteration for Long-run Average Reward in Markov Decision Processes

Author: A Komuravelli
A McIver
AF Veinott
AK McIver
C Baier
C Courcoubetis
J Filar
K Chatterjee
M Duflot
M Kwiatkowska
ML Puterman
O Michael
RA Howard
S Giro
S Haddad
T Brázdil
Publication date: 13/07/2017

Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long-term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other properties, such as reachability objectives. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally for each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending the above approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that is able to deal with very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks.
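For orientation, value iteration for a reachability objective (the building block this abstract refers to) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the dictionary encoding of the MDP, the function name `reachability_value_iteration`, and the difference-based stopping test are all assumptions made here, and that stopping test is a heuristic that gives no approximation guarantee in general.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: state -> list of actions; each action is a list of
# (probability, successor) pairs whose probabilities sum to 1.
MDP = Dict[str, List[List[Tuple[float, str]]]]


def reachability_value_iteration(
    mdp: MDP, targets: Set[str], max_iterations: int = 1000, tol: float = 1e-10
) -> Dict[str, float]:
    """Approximate the maximal probability of reaching `targets` from below."""
    # Start from the lower bound: 1 on target states, 0 everywhere else.
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iterations):
        delta = 0.0
        for state, actions in mdp.items():
            if state in targets or not actions:
                continue
            # Bellman update: best expected value over the available actions.
            best = max(sum(p * values[t] for p, t in action) for action in actions)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        # Heuristic stop: small change between sweeps (not a sound criterion).
        if delta < tol:
            break
    return values


if __name__ == "__main__":
    example: MDP = {
        "s0": [[(0.5, "goal"), (0.5, "sink")], [(0.9, "s1"), (0.1, "sink")]],
        "s1": [[(0.8, "goal"), (0.2, "sink")]],
        "goal": [],
        "sink": [[(1.0, "sink")]],
    }
    print(reachability_value_iteration(example, {"goal"}))
```

In the example, the second action of `s0` is preferable because it reaches `goal` through `s1` with probability 0.9 × 0.8, which exceeds the 0.5 offered by the first action.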

arXiv.org e-Print Archive

Crossref

Lancaster E-Prints

The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

Author: AL Strehl
C Baier
C Courcoubetis
C Dehnert
Krishnendu Chatterjee
L Valiant
LP Kaelbling
M Kwiatkowska
M Steinmetz
ML Puterman
N Fijalkow
PR D’Argenio
S Fortune
SJ Russell
T Brázdil
T Eilam-Tzoreff
Publication date: 01/01/2018

We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the same value from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed. Then, actions leading to sub-optimal states can be removed. We show the natural decision problem associated to computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.

arXiv.org e-Print Archive

Crossref

Institutional Repository Universiteit Antwerpen

DI-fusion

LNCS

Author: A Ferrara
AK Goharshady
C Daws
C Dehnert
EA Feinberg
EM Hahn
FV Fomin
HL Bodlaender
J Fearnley
J Křetínský
J Obdržálek
JR Norris
K Chatterjee
M Kwiatkowska
M Thorup
ML Puterman
N Robertson
R Bellman
T Quatmann
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) are two standard formalisms in system analysis. Their main associated quantitative objectives are hitting probabilities, discounted sum, and mean payoff. Although there are many techniques for computing these objectives in general MCs/MDPs, they have not been thoroughly studied in terms of parameterized algorithms, particularly when treewidth is used as the parameter. This is in sharp contrast to qualitative objectives for MCs, MDPs and graph games, for which treewidth-based algorithms yield significant complexity improvements. In this work, we show that treewidth can also be used to obtain faster algorithms for the quantitative problems. For an MC with $n$ states and $m$ transitions, we show that each of the classical quantitative objectives can be computed in $O((n+m) \cdot t^2)$ time, given a tree decomposition of the MC with width $t$. Our results also imply a bound of $O(\kappa \cdot (n+m) \cdot t^2)$ for each objective on MDPs, where $\kappa$ is the number of strategy-iteration refinements required for the given input and objective. Finally, we make an experimental evaluation of our new algorithms on low-treewidth MCs and MDPs obtained from the DaCapo benchmark suite. Our experiments show that on low-treewidth MCs and MDPs, our algorithms outperform existing well-established methods by one or more orders of magnitude.

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Maximizing the Conditional Expected Reward for Reaching the Goal

Author: C Acerbi
C Baier
DP Bertsekas
F Gretz
G Barthe
G Seber
J-P Katoen
K Chatterjee
K Chatzikokolakis
L Alfaro
L Kallenberg
M Kwiatkowska
M Randour
ME Andrés
ML Puterman
MS Alvim
T Brázdil
Publication date: 19/01/2017

The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler. Comment: 103 pages, extended version with appendices of a paper accepted at TACAS 201

arXiv.org e-Print Archive

Crossref

LNCS

Author: A Condon
A Hartmanns
A Tarski
AJ Hoffman
C Baier
C Dehnert
DM Roijers
E Altman
F Delgrange
J Filar
K Chatterjee
L Blume
M Kwiatkowska
M Randour
M Svorenová
ML Puterman
N Basset
PC Fishburn
R Bloem
T Brázdil
T Chen
T Quatmann
V Bruyère
V Forejt
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as angelic and demonic non-determinism. Lexicographic order allows one to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic, and that memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP ∩ coNP, matching the currently known bound for single objectives; in general, the decision problem is PSPACE-hard and can be solved in NEXPTIME ∩ coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to the computation of optimal strategies in a sequence of single-objective games. We have implemented our algorithm and report experimental results on various case studies.

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Publikationsserver der RWTH Aachen University

Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games

Author: A Condon
A McIver
AJ Hoffman
C Baier
C Courcoubetis
D Andersson
E Kelmendi
K Chatterjee
M Kwiatkowska
M Svorenová
ML Fredman
P Ashok
R Calinescu
S Haddad
T Brázdil
T Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/09/2020

Solving stochastic games with the reachability objective is a fundamental problem, especially in quantitative verification and synthesis. For this purpose, bounded value iteration (BVI) attracts attention as an efficient iterative method. However, BVI's performance is often impeded by costly end component (EC) computation that is needed to ensure convergence. Our contribution is a novel BVI algorithm that conducts, in addition to local propagation by the Bellman update that is typical of BVI, global propagation of upper bounds that is not hindered by ECs. To conduct global propagation in a computationally tractable manner, we construct a weighted graph and solve the widest path problem in it. Our experiments show the algorithm's performance advantage over the previous BVI algorithms that rely on EC computation. Comment: v2: a URL to the implementation is added
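The widest path problem mentioned in this abstract asks, for each node, for the path from a source whose smallest edge weight is as large as possible. A minimal sketch of the generic problem, using a Dijkstra-style search with a max-heap, is given below. The adjacency-list encoding and the name `widest_path_widths` are assumptions made for illustration; how the paper constructs its weighted graph from a stochastic game is not described in the abstract and is not modelled here.

```python
import heapq
from typing import Dict, List, Tuple

# Assumed encoding: node -> list of (successor, edge weight) pairs,
# with non-negative edge weights.
Graph = Dict[str, List[Tuple[str, float]]]


def widest_path_widths(graph: Graph, source: str) -> Dict[str, float]:
    """Return, for every reachable node, the best achievable bottleneck width."""
    width: Dict[str, float] = {source: float("inf")}
    # heapq is a min-heap, so widths are stored negated to pop the widest first.
    heap: List[Tuple[float, str]] = [(-float("inf"), source)]
    settled = set()
    while heap:
        neg_width, node = heapq.heappop(heap)
        if node in settled:
            continue  # stale heap entry for an already finalized node
        settled.add(node)
        for successor, weight in graph.get(node, []):
            # Width of a path is the minimum edge weight along it.
            candidate = min(-neg_width, weight)
            if candidate > width.get(successor, float("-inf")):
                width[successor] = candidate
                heapq.heappush(heap, (-candidate, successor))
    return width


if __name__ == "__main__":
    example: Graph = {
        "a": [("b", 0.9), ("c", 0.4)],
        "b": [("c", 0.6)],
        "c": [("d", 0.7)],
    }
    print(widest_path_widths(example, "a"))
```

In the example, node `c` is better reached through `b` (bottleneck 0.6) than directly from `a` (bottleneck 0.4), which is the kind of global information a purely local update would need several steps to propagate.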

arXiv.org e-Print Archive

Crossref

Hot Jupiters from Secular Planet–Planet Interactions

Author: AA Zdziarski
AHMJ Triaud
BS Gaudi
C Marois
D Fabrycky
DNC Lin
EB Ford
Frederic A. Rasio
FS Masset
G Takeda
JB Pollack
Jean Teyssandier
JN Winn
KC Schlaufman
LG Kiseleva
M Holman
M Nagasawa
ML Lidov
P Kalas
PP Eggleton
RS Harrington
S Chatterjee
S Matsumura
S Mikkola
Smadar Naoz
T Mazeh
Will M. Farr
Y Kozai
Y Krymolowski
Y Wu
Yoram Lithwick
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/05/2011

About 25 per cent of 'hot Jupiters' (extrasolar Jovian-mass planets with close-in orbits) are actually orbiting counter to the spin direction of the star. Perturbations from a distant binary star companion can produce high inclinations, but cannot explain orbits that are retrograde with respect to the total angular momentum of the system. Such orbits in a stellar context can be produced through secular (that is, long term) perturbations in hierarchical triple-star systems. Here we report a similar analysis of planetary bodies, including both octupole-order effects and tidal friction, and find that we can produce hot Jupiters in orbits that are retrograde with respect to the total angular momentum. With distant stellar mass perturbers, such an outcome is not possible. With planetary perturbers, the inner orbit's angular momentum component parallel to the total angular momentum need not be constant. In fact, as we show here, it can even change sign, leading to a retrograde orbit. A brief excursion to very high eccentricity during the chaotic evolution of the inner orbit allows planet-star tidal interactions to rapidly circularize that orbit, decoupling the planets and forming a retrograde hot Jupiter. Comment: accepted for publication by Nature, 3 figures (version after proof - some typos corrected)

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Characterisation of the Mouse Vasoactive Intestinal Peptide Receptor Type 2 Gene, Vipr2, and Identification of a Polymorphic LINE-1-like Sequence That Confers Altered Promoter Activity

Author: Asano E
Chatterjee TK
E. M. Lutz
G. Steel
Gozes I
Harmar AJ
Hezareh M
McCuaig KA
McCulloch D
Mears ML
Mount SM
Pei L
Sheward WJ
Shrivastava A
Zolnierowicz S
Publication venue: Blackwell Publishing Ltd
Publication date: 01/01/2007

The VPAC(2) receptor is a seven-transmembrane-spanning G protein-coupled receptor for two neuropeptides, vasoactive intestinal peptide (VIP) and pituitary adenylate cyclase-activating polypeptide (PACAP). It has a distinct tissue-specific, developmental and inducible expression that underlies an important neuroendocrine role. Here, we report the characterisation of the gene that encodes the mouse VPAC(2) receptor (Vipr2), localisation of the transcriptional start site and functional analysis of the promoter region. The Vipr2 gene contains 12 introns within its protein-coding region and spans 68.6 kb. Comparison of the 5′ untranslated region sequences for cloned 5′-RACE products amplified from different tissues showed that they were all contained within the same exon, with the longest extending 111 bp upstream of the ATG start site. Functional analysis of the 3.2-kb 5′-flanking region using sequentially deleted sequences cloned into a luciferase gene reporter vector revealed that this region is active as a promoter in mouse AtT20 D16:16 and rat GH4C1 cell lines. The core promoter is located within a 180-bp GC-rich region proximal to the ATG start codon and contains potential binding sites for Sp1 and AP2, but no TATA-box. Further upstream, in two out of three mouse strains examined, we have discovered a 496-bp polymorphic DNA sequence that bears a significant identity to mouse LINE-1 DNA. Comparison of the promoter activity between luciferase reporter gene constructs derived from the BALB/c (which contains this sequence) and C57BL/6J (which lacks this sequence) Vipr2 promoter regions showed a three-fold difference in luciferase gene activity when expressed in mouse AtT20 D16:16 and αT3-1 cells, but not when expressed in the rat GH4C1 cells or in COS 7 cells. Our results suggest that the mouse Vipr2 gene may be differentially active in different mouse strains, depending on the presence of this LINE-1-like sequence in the promoter region.

Crossref

PubMed Central

Early Infant Diagnosis of HIV in Three Regions in Tanzania; Successes and Challenges.

By the end of 2009 an estimated 2.5 million children worldwide were living with HIV-1, mostly as a consequence of vertical transmission, and more than 90% of these children live in sub-Saharan Africa. In 2008 the World Health Organization (WHO) recommended early initiation of Highly Active Antiretroviral Therapy (HAART) for all HIV-infected infants diagnosed within the first year of life and, since 2010, within the first two years of life, irrespective of CD4 count or WHO clinical stage. The study aimed to describe the implementation of early infant diagnosis (EID) programs in three Tanzanian regions that differ in HIV prevalence and in logistical set-up with regard to HIV DNA testing. Data were obtained by review of the prevention of mother-to-child transmission of HIV (PMTCT) registers from 2009-2011 at the Reproductive and Child Health Clinics (RCH) and from the databases of the Care and Treatment Clinics (CTC) in all three regions: Kilimanjaro, Mbeya and Tanga. The statistical methods used were a Poisson regression model and the rank sum test. During the period 2009-2011 a total of 4,860 exposed infants were registered at the reviewed sites, of whom 4,292 (88.3%) were screened for HIV infection. The overall proportion of tested infants in the three regions increased from 77.2% in 2009 to 97.8% in 2011. A total of 452 (10.5%) were found to be HIV infected (judged by the result of the first test). The prevalence of HIV infection among infants was higher in Mbeya than in Kilimanjaro region (RR = 1.872, 95% CI = 1.408-2.543, p < 0.001). However, sample turnaround time was significantly shorter in both Mbeya (2.7 weeks) and Tanga (5.0 weeks) than in Kilimanjaro (7.0 weeks), p < 0.001. Substantial loss to follow-up (LTFU) was evident at all stages of EID services in the period 2009 to 2011. Among the infants who were receiving treatment, 61% were found to be LTFU during the review period.
The study showed an increase in testing of HIV-exposed infants over the three years and large variation in HIV prevalence among the regions. Challenges such as sample turnaround time and LTFU must be overcome before this can translate into the intended goal of early initiation of lifelong, lifesaving antiretroviral therapy for the infants.

Crossref

Springer - Publisher Connector

Copenhagen University Research Information System

PubMed Central