27 research outputs found

An Adversarial Interpretation of Information-Theoretic Bounded Rationality

Authors: Lee, Daniel D.; Ortega, Pedro A.
Publication date: 22/04/2014

Recently, there has been a growing interest in modeling planning with information constraints. Accordingly, an agent maximizes a regularized expected utility known as the free energy, where the regularizer is given by the information divergence from a prior to a posterior policy. While this approach can be justified in various ways, including from statistical mechanics and information theory, it is still unclear how it relates to decision-making against adversarial environments. This connection has previously been suggested in work relating the free energy to risk-sensitive control and to extensive form games. Here, we show that a single-agent free energy optimization is equivalent to a game between the agent and an imaginary adversary. The adversary can, by paying an exponential penalty, generate costs that diminish the decision maker's payoffs. It turns out that the optimal strategy of the adversary consists in choosing costs so as to render the decision maker indifferent among its choices, which is a defining property of a Nash equilibrium, thus tightening the connection between free energy optimization and game theory. Comment: 7 pages, 4 figures. Proceedings of AAAI-1
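
The regularized objective described in this abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the function names, the inverse-temperature parameter `beta`, and the three-action example are assumptions made for illustration. The sketch uses the standard form of a KL-regularized expected utility, whose maximizer is the prior reweighted by exponentiated utilities, and checks that at the optimum the penalized payoff is the same for every action, which is one way to read the indifference property the abstract mentions.

```python
import math

def free_energy(policy, prior, utility, beta):
    """Expected utility minus (1/beta) * KL(policy || prior)."""
    expected_utility = sum(p * u for p, u in zip(policy, utility))
    kl = sum(p * math.log(p / q) for p, q in zip(policy, prior) if p > 0)
    return expected_utility - kl / beta

def optimal_policy(prior, utility, beta):
    """Maximizer of the free energy: prior reweighted by exp(beta * utility)."""
    weights = [q * math.exp(beta * u) for q, u in zip(prior, utility)]
    z = sum(weights)
    return [w / z for w in weights]

def optimal_free_energy(prior, utility, beta):
    """Closed-form maximum: (1/beta) * log sum_x prior(x) * exp(beta * U(x))."""
    total = sum(q * math.exp(beta * u) for q, u in zip(prior, utility))
    return math.log(total) / beta

# Hypothetical example values (assumptions, not from the paper).
prior = [1 / 3, 1 / 3, 1 / 3]
utility = [1.0, 2.0, 0.5]
beta = 2.0

p_star = optimal_policy(prior, utility, beta)

# Penalized payoff per action: U(x) - (1/beta) * log(p*(x) / prior(x)).
# At the optimum this is constant across actions.
penalized = [u - math.log(p / q) / beta
             for u, p, q in zip(utility, p_star, prior)]
```

Here `penalized` holds the same value for every action, and that value equals `optimal_free_energy(prior, utility, beta)`. Identifying the log-ratio term with the adversary's costs is an interpretation adopted for this sketch only.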

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Universal Convexification via Risk-Aversion

Authors: Dvijotham, Krishnamurthy; Fazel, Maryam; Todorov, Emanuel
Publication date: 02/06/2014

We develop a framework for convexifying a fairly general class of optimization problems. Under additional assumptions, we analyze the suboptimality of the solution to the convexified problem relative to the original nonconvex problem and prove additive approximation guarantees. We then develop algorithms based on stochastic gradient methods to solve the resulting optimization problems and show bounds on convergence rates. We show a simple application of this framework to supervised learning, where one can perform integration explicitly and can use standard (non-stochastic) optimization algorithms with better convergence guarantees. We then extend this framework to apply to a general class of discrete-time dynamical systems. In this context, our convexification approach falls under the well-studied paradigm of risk-sensitive Markov Decision Processes. We derive the first known model-based and model-free policy gradient optimization algorithms with guaranteed convergence to the optimal solution. Finally, we present numerical results validating our formulation in different applications.

arXiv.org e-Print Archive

CiteSeerX

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Authors: A Geramifard; A Guez; A Nilim; AL Strehl; D Bertsekas; E Todorov; GN Iyengar; HJ Kappen; J Rubin; KJ Åström; LP Hansen; N Tishby; PA Ortega; S Mannor; S Ross; W Wiesemann; Y Shen
Publication date: 07/04/2016

Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation. Comment: 16 pages, 3 figure
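
As a rough illustration of value iteration under a KL constraint on the policy, the sketch below covers only the known-model case; it does not implement the paper's generalized scheme with model uncertainty. The function name, the parameters `beta` and `gamma`, and the two-state MDP are all assumptions made for this example. The backup replaces the usual maximum over actions with a prior-weighted log-sum-exp, which approaches the standard maximum as `beta` grows large.

```python
import math

def soft_value_iteration(P, R, prior, beta, gamma=0.9, tol=1e-10, max_iter=10000):
    """KL-regularized value iteration with a known transition model.

    P[s][a][t]  : probability of moving from state s to state t under action a
    R[s][a]     : immediate reward
    prior[s][a] : reference (prior) policy
    beta        : inverse temperature; large beta approaches the hard max
    """
    n_states = len(R)
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = []
        for s in range(n_states):
            # Action values under the current value estimate.
            q = [R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                 for a in range(len(R[s]))]
            # Numerically stable (1/beta) * log sum_a prior * exp(beta * q).
            m = max(q)
            s_exp = sum(prior[s][a] * math.exp(beta * (q[a] - m))
                        for a in range(len(q)))
            V_new.append(m + math.log(s_exp) / beta)
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V

# Hypothetical two-state, two-action MDP (an assumption for illustration).
# State 0: action 0 stays (reward 0), action 1 moves to state 1 (reward 1).
# State 1: action 0 stays (reward 2), action 1 moves to state 0 (reward 0).
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]
prior = [[0.5, 0.5], [0.5, 0.5]]

V_sharp = soft_value_iteration(P, R, prior, beta=200.0)  # close to standard VI
V_soft = soft_value_iteration(P, R, prior, beta=0.1)     # close to the prior policy
```

With a large `beta` the values approach those of standard value iteration for this toy problem, while a small `beta` keeps the policy near the uniform prior and yields lower values.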

arXiv.org e-Print Archive

Crossref

MPG.PuRe