Free Energy and the Generalized Optimality Equations for Sequential Decision Making
The free energy functional has recently been proposed as a variational
principle for bounded rational decision-making, since it instantiates a natural
trade-off between utility gains and information processing costs that can be
axiomatically derived. Here we apply the free energy principle to general
decision trees that include both adversarial and stochastic environments. We
derive generalized sequential optimality equations that not only include the
Bellman optimality equations as a limit case, but also lead to well-known
decision-rules such as Expectimax, Minimax and Expectiminimax. We show how
these decision-rules can be derived from a single free energy principle that
assigns a resource parameter to each node in the decision tree. These resource
parameters express a concrete computational cost that can be measured as the
amount of samples that are needed from the distribution that belongs to each
node. The free energy principle therefore provides the normative basis for
generalized optimality equations that account for both adversarial and
stochastic environments.
Comment: 10 pages, 2 figures
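The interpolation described in this abstract can be illustrated numerically. The free energy (certainty equivalent) of a node is F_beta = (1/beta) log E_p[exp(beta U)], where beta is the node's resource parameter. As beta goes to +inf the node behaves like a max node (Expectimax), as beta goes to -inf like a min node (Minimax), and as beta goes to 0 like an expectation node. This is a minimal sketch with made-up probabilities and utilities, not the paper's code:

```python
import math

def free_energy(p, U, beta):
    """Certainty equivalent F_beta = (1/beta) * log E_p[exp(beta * U)]."""
    if beta == 0:
        # Risk-neutral limit: plain expected utility.
        return sum(pi * ui for pi, ui in zip(p, U))
    # Log-sum-exp shift for numerical stability at large |beta|.
    m = max(beta * u for u in U)
    s = sum(pi * math.exp(beta * u - m) for pi, u in zip(p, U))
    return (m + math.log(s)) / beta

p = [0.5, 0.3, 0.2]          # distribution attached to the node
U = [1.0, 4.0, 2.0]          # utilities of the child nodes

print(free_energy(p, U, 100.0))   # ~max(U) = 4: Expectimax-like node
print(free_energy(p, U, -100.0))  # ~min(U) = 1: Minimax-like node
print(free_energy(p, U, 0.0))     # E[U] = 2.1: expectation node
```

Applying this operator recursively over a tree, with a beta per node, yields the generalized optimality equations; mixing positive and negative betas across levels recovers Expectiminimax-style recursions.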
A Minimum Relative Entropy Principle for Learning and Acting
This paper proposes a method to construct an adaptive agent that is universal
with respect to a given class of experts, where each expert is an agent that
has been designed specifically for a particular environment. This adaptive
control problem is formalized as the problem of minimizing the relative entropy
of the adaptive agent from the expert that is most suitable for the unknown
environment. If the agent is a passive observer, then the optimal solution is
the well-known Bayesian predictor. However, if the agent is active, then its
past actions need to be treated as causal interventions on the I/O stream
rather than normal probability conditions. Here it is shown that the solution
to this new variational problem is given by a stochastic controller called the
Bayesian control rule, which implements adaptive behavior as a mixture of
experts. Furthermore, it is shown that under mild assumptions, the Bayesian
control rule converges to the control law of the most suitable expert.
Comment: 36 pages, 11 figures
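The Bayesian control rule described here can be sketched as a sampling procedure: draw an expert from the current posterior, act as that expert would, then update the posterior on the resulting observation only (the agent's own action enters as a causal intervention, i.e. a conditional, not as evidence). The following toy two-environment bandit is an illustrative sketch with invented payoffs, not the paper's setup:

```python
import random

random.seed(0)

TRUE_ENV = 1                        # unknown to the agent
PAYOFF = {0: [0.9, 0.1],            # P(reward = 1 | environment, arm)
          1: [0.1, 0.9]}

def likelihood(env, arm, reward):
    p = PAYOFF[env][arm]
    return p if reward == 1 else 1.0 - p

# Expert i is tuned to environment i and always pulls arm i.
posterior = [0.5, 0.5]              # uniform prior over experts
for t in range(200):
    # Bayesian control rule: sample an expert from the posterior ...
    env = random.choices([0, 1], weights=posterior)[0]
    arm = env                       # ... and act as that expert would.
    reward = 1 if random.random() < PAYOFF[TRUE_ENV][arm] else 0
    # Update only on the observed reward; the chosen arm is treated as
    # an intervention and carries no evidential weight of its own.
    w = [posterior[e] * likelihood(e, arm, reward) for e in (0, 1)]
    z = sum(w)
    posterior = [wi / z for wi in w]

print(posterior)  # mass concentrates on the expert for the true environment
```

Acting becomes a mixture of experts weighted by the posterior, and under the paper's assumptions the sampled behavior converges to that of the expert matching the unknown environment.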
Abstraction in decision-makers with limited information processing capabilities
A distinctive property of human and animal intelligence is the ability to
form abstractions by neglecting irrelevant information, which makes it
possible to separate structure from noise. From an information-theoretic
point of view, abstractions are desirable because they allow for very
efficient information processing. In artificial systems, abstractions are
often implemented through the computationally costly formation of groups or
clusters. In this work we establish the relation between the free-energy
framework for decision-making and rate-distortion theory, and demonstrate
how applying rate-distortion theory to decision-making leads to the
emergence of abstractions. We argue that abstractions are induced by a limit
in information processing capacity.
Comment: Presented at the NIPS 2013 Workshop on Planning with Information
Constraints
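The rate-distortion view of this abstract admits a compact sketch via Blahut-Arimoto-style iterations: the optimal channel satisfies p(a|s) proportional to q(a) * exp(beta * U(s, a)), where q(a) is the action marginal and beta trades utility against information cost. At low beta the policy collapses onto the state-independent marginal, i.e. states are lumped together and an abstraction emerges. The utilities below are made up for illustration:

```python
import math

U = [[1.0, 0.0],    # utility U[s][a] for 2 states x 2 actions
     [0.0, 1.0]]
p_s = [0.5, 0.5]    # state distribution

def solve(beta, iters=200):
    """Alternate between the channel and its marginal until convergence."""
    q = [0.5, 0.5]                      # initial action marginal
    for _ in range(iters):
        cond = []
        for s in range(2):
            w = [q[a] * math.exp(beta * U[s][a]) for a in range(2)]
            z = sum(w)
            cond.append([wi / z for wi in w])
        q = [sum(p_s[s] * cond[s][a] for s in range(2)) for a in range(2)]
    return cond

# High beta: state-specific policies, no abstraction.
print(solve(10.0))
# Low beta: p(a|s) ~ q(a) for every s -- the states are abstracted away.
print(solve(0.01))
```

The information-processing limit (small beta) is what forces the grouping: states that do not pay for their distinction in utility get mapped to the same action distribution.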
A conversion between utility and information
Rewards typically express desirabilities or preferences over a set of
alternatives. Here we propose that rewards can be defined for any probability
distribution based on three desiderata, namely that rewards should be
real-valued, additive and order-preserving, where the latter implies that more
probable events should also be more desirable. Our main result states that
rewards are then uniquely determined by the negative information content. To
analyze stochastic processes, we define the utility of a realization as its
reward rate. Under this interpretation, we show that the expected utility of a
stochastic process is its negative entropy rate. Furthermore, we apply our
results to analyze agent-environment interactions. We show that the expected
utility that will actually be achieved by the agent is given by the negative
cross-entropy from the input-output (I/O) distribution of the coupled
interaction system and the agent's I/O distribution. Thus, our results allow
for an information-theoretic interpretation of the notion of utility and the
characterization of agent-environment interactions in terms of entropy
dynamics.
Comment: AGI-2010. 6 pages, 1 figure
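The identities in this abstract are easy to check numerically. With reward(x) = log p(x) (the negative information content, in nats), the expected utility under p equals the negative entropy -H(p); and when outcomes are actually drawn from a distribution q while the agent's model is p, the achieved expected utility is the negative cross-entropy -H(q, p), which by Gibbs' inequality never exceeds -H(q). The distributions below are made up for illustration:

```python
import math

p = [0.5, 0.25, 0.25]       # agent's I/O distribution (toy example)
q = [0.7, 0.2, 0.1]         # coupled interaction system's distribution

entropy = -sum(pi * math.log(pi) for pi in p)             # H(p)
expected_utility = sum(pi * math.log(pi) for pi in p)     # E_p[log p] = -H(p)

achieved = sum(qi * math.log(pi) for qi, pi in zip(q, p)) # -H(q, p)
ideal = sum(qi * math.log(qi) for qi in q)                # -H(q)

print(expected_utility, -entropy)   # equal by construction
print(achieved, ideal)              # achieved <= ideal: model mismatch
                                    # costs utility (Gibbs' inequality)
```

The gap ideal - achieved is exactly the KL divergence D(q || p), so utility shortfall and model misspecification are measured by the same quantity.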