An Adversarial Interpretation of Information-Theoretic Bounded Rationality
Recently, there has been a growing interest in modeling planning with
information constraints. Accordingly, an agent maximizes a regularized expected
utility known as the free energy, where the regularizer is given by the
information divergence from a prior to a posterior policy. While this approach
can be justified in various ways, including from statistical mechanics and
information theory, it is still unclear how it relates to decision-making
against adversarial environments. This connection has previously been suggested
in work relating the free energy to risk-sensitive control and to extensive
form games. Here, we show that a single-agent free energy optimization is
equivalent to a game between the agent and an imaginary adversary. The
adversary can, by paying an exponential penalty, generate costs that diminish
the decision maker's payoffs. It turns out that the optimal strategy of the
adversary consists in choosing costs so as to render the decision maker
indifferent among its choices, which is a defining property of a Nash
equilibrium, thus tightening the connection between free energy optimization
and game theory.
Comment: 7 pages, 4 figures. Proceedings of AAAI-1
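The regularized objective described in this abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a finite action set, writes the regularizer as KL(p || prior) scaled by 1/beta, and uses made-up utilities and a made-up prior. Under those assumptions the maximizer is p(x) proportional to prior(x) * exp(beta * U(x)), and subtracting the per-action information cost (1/beta) * log(p(x) / prior(x)) from each utility leaves the same net payoff for every action, which is one way to see the indifference property mentioned above.

```python
import math

def bounded_rational_policy(utilities, prior, beta):
    """Maximize E_p[U] - (1/beta) * KL(p || prior) over a finite action set.

    Returns the optimal policy and the optimal value (the free energy).
    All inputs are illustrative assumptions, not values from the paper.
    """
    weights = [q * math.exp(beta * u) for u, q in zip(utilities, prior)]
    z = sum(weights)
    policy = [w / z for w in weights]
    free_energy = math.log(z) / beta
    return policy, free_energy

# Hypothetical three-action problem.
utilities = [1.0, 0.5, -0.2]
prior = [0.2, 0.5, 0.3]
beta = 2.0

policy, free_energy = bounded_rational_policy(utilities, prior, beta)

# Per-action information cost implied by the optimal policy.
penalties = [math.log(p / q) / beta for p, q in zip(policy, prior)]

# Utility minus cost is the same for every action and equals the free energy.
net_payoffs = [u - c for u, c in zip(utilities, penalties)]
```

Here `net_payoffs` holds one value per action; all entries coincide with `free_energy` up to floating-point error.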
Bounded Rational Decision-Making in Changing Environments
A perfectly rational decision-maker chooses the best action with the highest
utility gain from a set of possible actions. The optimality principles that
describe such decision processes do not take into account the computational
costs of finding the optimal action. Bounded rational decision-making addresses
this problem by specifically trading off information-processing costs and
expected utility. Interestingly, a similar trade-off between energy and entropy
arises when describing changes in thermodynamic systems. This similarity has
been recently used to describe bounded rational agents. Crucially, this
framework assumes that the environment does not change while the decision-maker
is computing the optimal policy. When this requirement is not fulfilled, the
decision-maker will suffer inefficiencies in utility that arise because the
current policy is optimal for an environment in the past. Here we borrow
concepts from non-equilibrium thermodynamics to quantify these inefficiencies
and illustrate with simulations their relationship with computational resources.
Comment: 9 pages, 2 figures, NIPS 2013 Workshop on Planning with Information
Constraint
Modeling rationality to control self-organization of crowds: An environmental approach
In this paper we propose a classification of crowd models in built
environments based on the assumed pedestrian ability to foresee the movements
of other walkers. At the same time, we introduce a new family of macroscopic
models, which make it possible to tune the degree of predictiveness (i.e.,
rationality) of the individuals. By means of these models we describe both the
natural behavior of pedestrians, i.e., their expected behavior according to
their real limited predictive ability, and a target behavior, i.e., a
particularly efficient behavior one would like them to assume (for, e.g.,
logistic or safety reasons). Then we tackle a challenging shape optimization
problem, which consists in controlling the environment in such a way that the
natural behavior is as close as possible to the target one, thereby inducing
pedestrians to behave more rationally than they naturally would. We
present numerical tests which elucidate the role of rational/predictive
abilities and show some promising results about the shape optimization problem.
Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes
Information-theoretic principles for learning and acting have been proposed
to solve particular classes of Markov Decision Problems. Mathematically, such
approaches are governed by a variational free energy principle and allow
solving MDP planning problems with information-processing constraints expressed
in terms of a Kullback-Leibler divergence with respect to a reference
distribution. Here we consider a generalization of such MDP planners by taking
model uncertainty into account. As model uncertainty can also be formalized as
an information-processing constraint, we can derive a unified solution from a
single generalized variational principle. We provide a generalized value
iteration scheme together with a convergence proof. As limit cases, this
generalized scheme includes standard value iteration with a known model,
Bayesian MDP planning, and robust planning. We demonstrate the benefits of this
approach in a grid world simulation.
Comment: 16 pages, 3 figure
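To make the idea of planning under a Kullback-Leibler constraint concrete, here is a minimal sketch of a free-energy ("soft") value iteration for the known-model case only. It is not the paper's generalized scheme and does not handle model uncertainty. The function name, the toy two-state MDP, the uniform reference policy, and the parameters `beta` and `gamma` are all assumptions made for illustration. The backup replaces the max over actions with (1/beta) * log sum_a prior(a) * exp(beta * Q(s, a)), which approaches the standard Bellman backup as beta grows.

```python
import math

def free_energy_value_iteration(P, R, prior, beta, gamma=0.9, iters=500):
    """Soft value iteration with a KL penalty on the action distribution.

    P[s][a][t]: transition probability, R[s][a]: reward,
    prior[a]: reference action distribution, beta > 0: resource parameter.
    """
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(iters):
        new_V = []
        for s in range(n_states):
            q_vals = [
                R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(len(P[s]))
            ]
            # Numerically stable log-sum-exp weighted by the prior.
            m = max(beta * q for q in q_vals)
            total = sum(prior[a] * math.exp(beta * q - m) for a, q in enumerate(q_vals))
            new_V.append((m + math.log(total)) / beta)
        V = new_V
    return V

# Hypothetical MDP: state 1 pays reward 1 for staying; everything else pays 0.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]
R = [[0.0, 0.0], [1.0, 0.0]]
prior = [0.5, 0.5]

V_high = free_energy_value_iteration(P, R, prior, beta=1000.0)  # nearly unconstrained
V_low = free_energy_value_iteration(P, R, prior, beta=0.001)    # stays close to the prior
```

With a large `beta` the values approach those of standard value iteration on this toy problem; with a small `beta` they approach the value of simply following the reference policy.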
An information-theoretic on-line update principle for perception-action coupling
Inspired by findings of sensorimotor coupling in humans and animals, there
has recently been a growing interest in the interaction between action and
perception in robotic systems [Bogh et al., 2016]. Here we consider perception
and action as two serial information channels with limited
information-processing capacity. We follow [Genewein et al., 2015] and
formulate a constrained optimization problem that maximizes utility under
limited information-processing capacity in the two channels. As a solution we
obtain an optimal perceptual channel and an optimal action channel that are
coupled such that perceptual information is optimized with respect to
downstream processing in the action module. The main novelty of this study is
that we propose an online optimization procedure to find bounded-optimal
perception and action channels in parameterized serial perception-action
systems. In particular, we implement the perceptual channel as a multi-layer
neural network and the action channel as a multinomial distribution. We
illustrate our method in a NAO robot simulator with a simplified cup lifting
task.
Comment: 8 pages, 2017 IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS
Free Energy and the Generalized Optimality Equations for Sequential Decision Making
The free energy functional has recently been proposed as a variational
principle for bounded rational decision-making, since it instantiates a natural
trade-off between utility gains and information processing costs that can be
axiomatically derived. Here we apply the free energy principle to general
decision trees that include both adversarial and stochastic environments. We
derive generalized sequential optimality equations that not only include the
Bellman optimality equations as a limit case, but also lead to well-known
decision-rules such as Expectimax, Minimax and Expectiminimax. We show how
these decision-rules can be derived from a single free energy principle that
assigns a resource parameter to each node in the decision tree. These resource
parameters express a concrete computational cost that can be measured as the
number of samples that are needed from the distribution that belongs to each
node. The free energy principle therefore provides the normative basis for
generalized optimality equations that account for both adversarial and
stochastic environments.
Comment: 10 pages, 2 figure
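The way a single resource parameter can interpolate between decision rules can be sketched with the log-sum-exp certainty equivalent (1/beta) * log sum_i p_i * exp(beta * v_i). This is a standard identity rather than the paper's full derivation, and the function name and example numbers below are assumptions: as beta goes to plus infinity the operator tends to the maximum (a maximizing node), as beta goes to zero it tends to the expectation (a chance node), and as beta goes to minus infinity it tends to the minimum (an adversarial node), provided every probability is strictly positive.

```python
import math

def certainty_equivalent(values, probs, beta):
    """Free-energy aggregation of child values at one node of a decision tree.

    beta -> +inf approaches max(values), beta -> 0 approaches the expectation,
    beta -> -inf approaches min(values). Assumes all probs are > 0.
    """
    if beta == 0.0:
        return sum(p * v for p, v in zip(probs, values))
    scaled = [beta * v for v in values]
    m = max(scaled)  # shift for numerical stability; works for either sign of beta
    total = sum(p * math.exp(s - m) for p, s in zip(probs, scaled))
    return (m + math.log(total)) / beta

# Hypothetical node with three children.
values = [1.0, 2.0, 4.0]
probs = [0.5, 0.3, 0.2]

near_max = certainty_equivalent(values, probs, beta=200.0)
near_mean = certainty_equivalent(values, probs, beta=1e-6)
near_min = certainty_equivalent(values, probs, beta=-200.0)
```

Applying this operator recursively from the leaves upward, with a separate `beta` per node, gives one possible reading of how Expectimax-, Minimax- and Expectiminimax-style backups can arise as limit cases.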
Abstraction in decision-makers with limited information processing capabilities
A distinctive property of human and animal intelligence is the ability to
form abstractions by neglecting irrelevant information, which makes it possible
to separate structure from noise. From an information-theoretic point of view,
abstractions are desirable because they allow for very efficient information
processing. In artificial systems, abstractions are often implemented through computationally
costly formations of groups or clusters. In this work we establish the relation
between the free-energy framework for decision making and rate-distortion
theory and demonstrate how the application of rate-distortion for
decision-making leads to the emergence of abstractions. We argue that
abstractions are induced by a limit in information-processing capacity.
Comment: Presented at the NIPS 2013 Workshop on Planning with Information
Constraint
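The link between rate-distortion theory and abstraction can be illustrated with a Blahut-Arimoto-style iteration. The sketch below is an assumption-laden toy, not the paper's experiments: the utility table, the uniform distribution over world states, and the parameter `beta` are invented. The iteration alternates between p(a|w) proportional to p(a) * exp(beta * U(w, a)) and p(a) = sum_w p(w) * p(a|w). With little capacity (small `beta`) both world states end up mapped to the same generic action; with ample capacity (large `beta`) each world state gets its own specific action.

```python
import math

def rate_distortion_policy(U, p_w, beta, iters=500):
    """Blahut-Arimoto-style iteration for a capacity-limited decision-maker.

    U[w][a]: utility of action a in world state w, p_w[w]: state distribution,
    beta > 0: trade-off between utility and mutual information I(W; A).
    Returns the conditional policy p(a|w) and the action marginal p(a).
    """
    n_w, n_a = len(U), len(U[0])
    p_a = [1.0 / n_a] * n_a
    cond = []
    for _ in range(iters):
        cond = []
        for w in range(n_w):
            m = max(beta * u for u in U[w])  # shift to avoid overflow
            weights = [p_a[a] * math.exp(beta * U[w][a] - m) for a in range(n_a)]
            z = sum(weights)
            cond.append([x / z for x in weights])
        # Marginal over actions induced by the current conditional policy.
        p_a = [sum(p_w[w] * cond[w][a] for w in range(n_w)) for a in range(n_a)]
    return cond, p_a

# Hypothetical utilities: actions 0 and 1 are specific, action 2 is generic.
U = [
    [1.0, 0.0, 0.8],  # world state 0
    [0.0, 1.0, 0.8],  # world state 1
]
p_w = [0.5, 0.5]

cond_low, p_a_low = rate_distortion_policy(U, p_w, beta=0.1)     # tight capacity
cond_high, p_a_high = rate_distortion_policy(U, p_w, beta=20.0)  # loose capacity
```

In this toy setting the generic action plays the role of an abstraction: it ignores the distinction between the two world states at a small loss in utility.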
The Simonian bounded rationality hypothesis and the expectation formation mechanism
Abstract. In the 1980s and at the beginning of the 1990s, the debate on the expectation formation mechanism was dominated by the rational expectation hypothesis. Later on, more interest was directed towards alternative approaches to expectations analysis, mainly based on the bounded rationality paradigm introduced earlier by Herbert A. Simon. The bounded rationality approach is used here to describe the way expectations might be formed by different agents. Furthermore, three main hypotheses, namely the adaptive, rational and bounded ones, are compared and used to indicate why time lags in economic policy prevail and are variable.
JEL Codes: D78, D84, H30, E00.
Keywords: bounded rationality, substantive and procedural rationality, expectation formation, adaptive and rational expectations, time lags