
1,056 research outputs found

Metareasoning for Planning Under Uncertainty

Author: Horvitz Eric
Kamar Ece
Kolobov Andrey
Lin Christopher H.
Publication date: 03/05/2015

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning, a process that trades off the cost of planning against the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems. Comment: Extended version of IJCAI 2015 paper.

arXiv.org e-Print Archive

CiteSeerX

Cutset Sampling for Bayesian Networks

Author: Bidyuk B.
Dechter R.
Publication venue: 'AI Access Foundation'
Publication date: 12/10/2011

The paper presents a new sampling methodology for Bayesian networks that samples only a subset of variables and applies exact inference to the rest. Cutset sampling is a network structure-exploiting application of the Rao-Blackwellisation principle to sampling in Bayesian networks. It improves convergence by exploiting memory-based inference algorithms. It can also be viewed as an anytime approximation of the exact cutset-conditioning algorithm developed by Pearl. Cutset sampling can be implemented efficiently when the sampled variables constitute a loop-cutset of the Bayesian network and, more generally, when the induced width of the network's graph conditioned on the observed sampled variables is bounded by a constant w. We demonstrate empirically the benefit of this scheme on a range of benchmarks.
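The Rao-Blackwellisation principle mentioned in this abstract can be illustrated on a toy two-variable network C -> X. The sketch below is not the paper's cutset sampling algorithm (it uses plain forward sampling rather than Gibbs sampling over a loop-cutset, and it has no evidence); it only shows the core idea of sampling one variable and handling the other exactly. The network, its probabilities, and all function names are assumptions made up for illustration.

```python
import random

# Hypothetical toy network C -> X (values invented for illustration).
P_C1 = 0.3                          # P(C = 1)
P_X1_GIVEN_C = {0: 0.2, 1: 0.9}     # P(X = 1 | C = c)


def rao_blackwell_estimate(n_samples, seed=0):
    """Estimate P(X = 1) by sampling only C and using the exact
    conditional P(X = 1 | C = c) for each sample instead of sampling X."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        c = 1 if rng.random() < P_C1 else 0   # sample the "cutset" variable
        total += P_X1_GIVEN_C[c]              # exact inference for the rest
    return total / n_samples


def exact_marginal():
    """Exact P(X = 1), obtained by summing over both values of C."""
    return P_C1 * P_X1_GIVEN_C[1] + (1 - P_C1) * P_X1_GIVEN_C[0]
```

Averaging exact conditionals rather than sampled values of X removes the sampling noise contributed by X, which is the variance reduction that Rao-Blackwellisation provides.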

arXiv.org e-Print Archive

Crossref

Parameter-Independent Strategies for pMDPs via POMDPs

Author: A Lukina
C Baier
C Daws
C Dehnert
D Beyer
E Bartocci
E Polgreen
EM Hahn
J Aspnes
K Chatterjee
LI Sennott
M Baldi
M Cubuktepe
M Kwiatkowska
MTJ Spaan
N Jansen
O Madani
PR Halmos
R Lanotte
S Pathak
S Russell
T Quatmann
V Kreinovich
Publication date: 01/01/2018

Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances. We study pMDPs with reachability objectives where the parameter values are unknown and impossible to measure directly during execution, but a probability distribution over the parameter values is known. For the first time, we study the computation of parameter-independent strategies that are expectation optimal, i.e., that optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal strategies in POMDPs. We evaluate our method experimentally on several benchmarks: a motivating (repeated) learner model; a series of benchmarks of varying configurations of a robot moving on a grid; and a consensus protocol. Comment: Extended version of a QEST 2018 paper.

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

IST Austria: PubRep (Institute of Science and Technology)

Generalized Evidence Theory

Author: Deng Yong
Publication date: 17/04/2014

Conflict management is still an open issue in the application of Dempster-Shafer evidence theory. Many works have been presented to address this issue. In this paper, a new theory, called generalized evidence theory (GET), is proposed. Compared with existing methods, GET assumes that the general situation is an open world, due to uncertainty and incomplete knowledge. The conflicting evidence is handled under the framework of GET. It is shown that the new theory can explain and deal with the conflicting evidence in a more reasonable way. Comment: 39 pages, 5 figures.
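To make the conflict problem concrete, the following is a minimal sketch of the classical Dempster rule of combination, not of GET itself, whose details the abstract does not give. Mass functions are represented as dictionaries from frozensets to masses; this representation, the function name, and the example masses are assumptions chosen for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset (focal element) to its mass.
    Returns (combined mass function, conflict coefficient K).
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb        # mass assigned to the empty set
    if not combined:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalise by 1 - K, which discards the conflicting mass.
    norm = 1.0 - conflict
    return {s: w / norm for s, w in combined.items()}, conflict


# Highly conflicting sources (the well-known example due to Zadeh).
m1 = {frozenset("A"): 0.99, frozenset("B"): 0.01}
m2 = {frozenset("C"): 0.99, frozenset("B"): 0.01}
fused, k = dempster_combine(m1, m2)
```

In this example nearly all mass is conflicting, yet normalisation assigns almost full belief to B, a hypothesis both sources considered very unlikely. This counterintuitive behaviour is the kind of conflict that motivates alternatives to the classical rule.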

arXiv.org e-Print Archive

CiteSeerX

Thermodynamics as a theory of decision-making with information processing costs

Author: Başar T
Bellman RE
Bishop CM
Braun DA
Callen HB
Camerer C
Daw ND
de Finetti B
Feynman RP
Gigerenzer G
Gladwell M
Gumbel EJ
Jaynes ET
Kahnemann D
Kolmogorov A
Luce RD
MacKay DJC
McFadden D
Meginnis JR
Ortega PA
Peters J
Rawlik K
Rubinstein A
Russell SJ
Savage LJ
Simon H
Stone LD
Sutton RS
Theodorou E
Tishby N
Todorov E
van den Broek JL
Vitanyi PMB
Von Neumann J
Whittle P
Wolpert D
Wolpert DH
Publication venue: 'The Royal Society'
Publication date: 30/07/2012

Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we propose an information-theoretic formalization of bounded rational decision-making where decision-makers trade off expected utility and information processing costs. Such bounded rational decision-makers can be thought of as thermodynamic machines that undergo physical state changes when they compute. Their behavior is governed by a free energy functional that trades off changes in internal energy (as a proxy for utility) and entropic changes representing computational costs induced by changing states. As a result, the bounded rational decision-making problem can be rephrased in terms of well-known concepts from statistical physics. In the limit when computational costs are ignored, the maximum expected utility principle is recovered. We discuss the relation to satisficing decision-making procedures as well as links to existing theoretical frameworks and human decision-making experiments that describe deviations from expected utility theory. Since most of the mathematical machinery can be borrowed from statistical physics, the main contribution is to axiomatically derive and interpret the thermodynamic free energy as a model of bounded rational decision-making. Comment: 26 pages, 5 figures (under revision since February 2012).
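The trade-off described in this abstract can be sketched numerically. The code below assumes one common form of the free energy, expected utility minus (1/beta) times the Kullback-Leibler divergence from a prior policy, whose maximiser is a Boltzmann-like distribution. The abstract does not state this exact form, so the formula, the parameter `beta`, and the function names are assumptions for illustration.

```python
import math


def bounded_rational_policy(utilities, prior, beta):
    """Return p(x) proportional to prior(x) * exp(beta * U(x)).

    beta acts as an inverse temperature: beta = 0 returns the prior
    (no resources spent), and large beta approaches the maximiser of U.
    """
    m = max(utilities)  # subtract the maximum for numerical stability
    weights = [p0 * math.exp(beta * (u - m)) for u, p0 in zip(utilities, prior)]
    z = sum(weights)
    return [w / z for w in weights]


def free_energy(utilities, prior, policy, beta):
    """Expected utility minus (1 / beta) * KL(policy || prior); beta > 0."""
    expected_utility = sum(p * u for p, u in zip(policy, utilities))
    kl = sum(p * math.log(p / p0) for p, p0 in zip(policy, prior) if p > 0)
    return expected_utility - kl / beta
```

Under these assumptions, letting beta grow large concentrates the policy on the highest-utility action, which corresponds to the limit in which the maximum expected utility principle is recovered.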

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Approximate Assertional Reasoning Over Expressive Ontologies

Author: Tserendorj Tuvshintur
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010

In this thesis, approximate reasoning methods for scalable assertional reasoning are provided whose computational properties can be established in a well-understood way, namely in terms of soundness and completeness, and whose quality can be analyzed in terms of statistical measurements, namely recall and precision. The basic idea of these approximate reasoning methods is to speed up reasoning by trading off the quality of reasoning results against increased speed.
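As a rough illustration of how precision and recall can quantify the quality of an approximate reasoner, the sketch below compares a set of answers from a hypothetical approximate reasoner against the answers of an exact one. The function name and the example assertions are assumptions; the thesis's actual evaluation setup is not described in the abstract.

```python
def precision_recall(approx_answers, exact_answers):
    """Compare approximate reasoning results against exact results.

    precision: fraction of approximate answers that are correct
               (1.0 means the approximation behaved soundly here).
    recall:    fraction of exact answers that were found
               (1.0 means the approximation behaved completely here).
    """
    approx, exact = set(approx_answers), set(exact_answers)
    correct = approx & exact
    precision = len(correct) / len(approx) if approx else 1.0
    recall = len(correct) / len(exact) if exact else 1.0
    return precision, recall


# Hypothetical instance-retrieval results for one query.
exact = {"Person(alice)", "Person(bob)", "Person(carol)", "Person(dave)"}
approx = {"Person(alice)", "Person(bob)", "Person(eve)"}
precision, recall = precision_recall(approx, exact)
```

Here the approximate reasoner returns one wrong answer and misses two correct ones, so both measures fall below 1.0.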

KITopen