Free Energy and the Generalized Optimality Equations for Sequential Decision Making
The free energy functional has recently been proposed as a variational
principle for bounded rational decision-making, since it instantiates a natural
trade-off between utility gains and information processing costs that can be
axiomatically derived. Here we apply the free energy principle to general
decision trees that include both adversarial and stochastic environments. We
derive generalized sequential optimality equations that not only include the
Bellman optimality equations as a limit case, but also lead to well-known
decision-rules such as Expectimax, Minimax and Expectiminimax. We show how
these decision-rules can be derived from a single free energy principle that
assigns a resource parameter to each node in the decision tree. These resource
parameters express a concrete computational cost that can be measured as the
number of samples that are needed from the distribution that belongs to each
node. The free energy principle therefore provides the normative basis for
generalized optimality equations that account for both adversarial and
stochastic environments.
Comment: 10 pages, 2 figures
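As a rough illustration of how a single per-node resource parameter can interpolate between these decision rules, the sketch below assumes the node value takes the log-sum-exp form (1/β) log Σ p(x) exp(β V(x)). The abstract does not spell this formula out, so the form, the function names `free_energy_value` and `tree_value`, and the tuple-based tree encoding are all assumptions made for this example, not the authors' implementation.

```python
import math

def free_energy_value(values, probs, beta):
    """Assumed node value: (1/beta) * log(sum_i p_i * exp(beta * v_i)).

    beta -> +inf approaches max (maximizing agent),
    beta -> -inf approaches min (adversary),
    beta -> 0    approaches the expectation (chance node).
    """
    if abs(beta) < 1e-12:
        # Limit case: plain expectation under the reference distribution.
        return sum(p * v for p, v in zip(probs, values))
    # Drop zero-probability branches, then use a shifted log-sum-exp
    # so that large |beta| does not overflow.
    support = [(p, beta * v) for p, v in zip(probs, values) if p > 0]
    shift = max(x for _, x in support)
    total = sum(p * math.exp(x - shift) for p, x in support)
    return (shift + math.log(total)) / beta

def tree_value(node):
    """Evaluate a tree whose leaves are numbers and whose inner nodes are
    (beta, probs, children) tuples (a hypothetical encoding)."""
    if isinstance(node, (int, float)):
        return float(node)
    beta, probs, children = node
    return free_energy_value([tree_value(c) for c in children], probs, beta)

# A maximizing root over two chance nodes behaves like Expectimax.
chance_a = (0.0, [0.5, 0.5], [0.0, 4.0])     # expectation 2.0
chance_b = (0.0, [0.5, 0.5], [1.0, 2.0])     # expectation 1.5
root = (1000.0, [0.5, 0.5], [chance_a, chance_b])
print(tree_value(root))
```

With a large positive β at the root and β = 0 at the chance nodes, the result is close to the Expectimax value of 2.0. Setting a large negative β at a node would make it behave like the minimizing player in Minimax.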
Variational optimization of probability measure spaces resolves the chain store paradox
In game theory, players have continuous expected payoff functions and can use
fixed point theorems to locate equilibria. This optimization method requires
that players adopt a particular type of probability measure space. Here, we
introduce alternative probability measure spaces that alter the dimensionality,
continuity, and differentiability properties of what are now the game's
expected payoff functionals. Optimizing such functionals requires generalized
variational and functional optimization methods to locate novel equilibria.
These variational methods can reconcile game theoretic prediction and observed
human behaviours, as we illustrate by resolving the chain store paradox. Our
generalized optimization analysis has significant implications for economics,
artificial intelligence, complex system theory, neurobiology, and biological
evolution and development.
Comment: 11 pages, 5 figures. Replaced for minor notational correctio
Econometrics for Learning Agents
The main goal of this paper is to develop a theory of inference of player
valuations from observed data in the generalized second price auction without
relying on the Nash equilibrium assumption. Existing work in Economics on
inferring agent values from data relies on the assumption that all participant
strategies are best responses to the observed play of other players, i.e., they
constitute a Nash equilibrium. In this paper, we show how to perform inference
relying on a weaker assumption instead: assuming that players are using some
form of no-regret learning. Learning outcomes have emerged in recent years as an
attractive alternative to Nash equilibrium in analyzing game outcomes, modeling
players who haven't reached a stable equilibrium, but rather use algorithmic
learning, aiming to learn the best way to play from previous observations. In
this paper we show how to infer values of players who use algorithmic learning
strategies. Such inference is an important first step before we move to testing
any learning theoretic behavioral model on auction data. We apply our
techniques to a dataset from Microsoft's sponsored search ad auction system
- …
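To make the no-regret assumption concrete, the sketch below runs the standard Hedge (multiplicative weights) update and reports its average regret against the best fixed action in hindsight. It only illustrates what "no-regret learning" means. It is not the paper's inference procedure, and the function name `hedge_average_regret`, the step size `eta`, and the toy payoff sequence are assumptions made for this example.

```python
import math

def hedge_average_regret(payoffs, eta=0.1):
    """Average regret of Hedge on payoffs[t][a], each assumed to lie in [0, 1].

    Regret compares the learner's expected payoff with the payoff of the
    best single action chosen in hindsight.
    """
    n_actions = len(payoffs[0])
    weights = [1.0] * n_actions
    totals = [0.0] * n_actions   # cumulative payoff of each fixed action
    earned = 0.0                 # learner's cumulative expected payoff
    for round_payoffs in payoffs:
        norm = sum(weights)
        probs = [w / norm for w in weights]
        earned += sum(p * u for p, u in zip(probs, round_payoffs))
        for a, u in enumerate(round_payoffs):
            totals[a] += u
            weights[a] *= math.exp(eta * u)  # reward actions that paid off
    return (max(totals) - earned) / len(payoffs)

# Toy sequence: action 0 always pays 1, action 1 always pays 0.
rounds = [[1.0, 0.0]] * 200
print(hedge_average_regret(rounds, eta=0.5))
```

On this toy sequence the average regret shrinks as the number of rounds grows, which is the property the weaker behavioral assumption relies on.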