Free Energy and the Generalized Optimality Equations for Sequential Decision Making

Authors: Braun Daniel A.; Ortega Pedro A.
Publication date: 17/05/2012

The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision rules such as Expectimax, Minimax and Expectiminimax. We show how these decision rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.

Comment: 10 pages, 2 figures

arXiv.org e-Print Archive

MPG.PuRe
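
The abstract of "Free Energy and the Generalized Optimality Equations for Sequential Decision Making" describes one backup rule with a resource parameter per node. A minimal sketch of that idea follows, assuming the common certainty-equivalent form (1/beta) * log sum_i p_i * exp(beta * v_i); the abstract itself does not state the formula, and the names `free_energy` and `solve` and the tuple encoding of the tree are hypothetical, chosen only for illustration. Under this assumption, beta = +inf behaves like a maximizing node, beta = 0 like a chance node, and beta = -inf like a minimizing (adversarial) node.

```python
import math

def free_energy(values, probs, beta):
    """Assumed node value: (1/beta) * log(sum_i p_i * exp(beta * v_i))."""
    support = [(p, v) for p, v in zip(probs, values) if p > 0]
    if beta == 0.0:
        # Limit beta -> 0: plain expectation (chance node).
        return sum(p * v for p, v in support)
    if math.isinf(beta):
        # Limits beta -> +inf / -inf: max / min over the support.
        vals = [v for _, v in support]
        return max(vals) if beta > 0 else min(vals)
    # Log-sum-exp shift for numerical stability.
    terms = [math.log(p) + beta * v for p, v in support]
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / beta

def solve(node):
    """Evaluate a tree of ("leaf", utility) or ("node", beta, [(prob, child), ...])."""
    if node[0] == "leaf":
        return node[1]
    _, beta, branches = node
    probs = [p for p, _ in branches]
    values = [solve(child) for _, child in branches]
    return free_energy(values, probs, beta)

def two_level_tree(inner_beta):
    """Maximizing root over two inner nodes that share the same beta."""
    left = ("node", inner_beta, [(0.5, ("leaf", 0.0)), (0.5, ("leaf", 10.0))])
    right = ("node", inner_beta, [(0.5, ("leaf", 4.0)), (0.5, ("leaf", 4.0))])
    return ("node", math.inf, [(0.5, left), (0.5, right)])

expectimax_value = solve(two_level_tree(0.0))        # inner nodes average
minimax_value = solve(two_level_tree(-math.inf))     # inner nodes minimize
```

With chance nodes inside, the root prefers the left branch (average 5 versus 4); with adversarial nodes inside, it prefers the right branch (worst case 4 versus 0). Finite values of beta interpolate between these cases.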

Evolutionary Game Theory

Author: Alexander James
Publication date: 01/01/2003

Article

SAS-SPACE

Variational optimization of probability measure spaces resolves the chain store paradox

Authors: Gagen Michael J.; Nemoto Kae
Publication date: 10/05/2006

In game theory, players have continuous expected payoff functions and can use fixed point theorems to locate equilibria. This optimization method requires that players adopt a particular type of probability measure space. Here, we introduce alternate probability measure spaces altering the dimensionality, continuity, and differentiability properties of what are now the game's expected payoff functionals. Optimizing such functionals requires generalized variational and functional optimization methods to locate novel equilibria. These variational methods can reconcile game theoretic prediction and observed human behaviours, as we illustrate by resolving the chain store paradox. Our generalized optimization analysis has significant implications for economics, artificial intelligence, complex system theory, neurobiology, and biological evolution and development.

Comment: 11 pages, 5 figures. Replaced for minor notational correction

arXiv.org e-Print Archive

Munich RePEc Personal Archive

Econometrics for Learning Agents

Authors: Nekipelov Denis; Syrgkanis Vasilis; Tardos Eva
Publication date: 04/05/2015

The main goal of this paper is to develop a theory of inference of player valuations from observed data in the generalized second price auction without relying on the Nash equilibrium assumption. Existing work in economics on inferring agent values from data relies on the assumption that all participant strategies are best responses to the observed play of other players, i.e. they constitute a Nash equilibrium. In this paper, we show how to perform inference relying on a weaker assumption instead: assuming that players are using some form of no-regret learning. Learning outcomes have emerged in recent years as an attractive alternative to Nash equilibrium in analyzing game outcomes, modeling players who haven't reached a stable equilibrium, but rather use algorithmic learning, aiming to learn the best way to play from previous observations. In this paper we show how to infer values of players who use algorithmic learning strategies. Such inference is an important first step before we move to testing any learning theoretic behavioral model on auction data. We apply our techniques to a dataset from Microsoft's sponsored search ad auction system.

arXiv.org e-Print Archive

Crossref
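
The abstract of "Econometrics for Learning Agents" rests on the assumption that players use some form of no-regret learning. As a rough illustration of what that assumption means (not the paper's inference procedure, which the abstract does not spell out), the sketch below runs the standard Hedge (multiplicative weights) rule on a made-up payoff sequence and measures the average regret against the best fixed action in hindsight. The function names, the payoff data, and the step size are all assumptions made for this example.

```python
import math

def hedge(payoffs, eta):
    """Return the mixed strategy Hedge plays in each round.

    payoffs[t][a] is the payoff in [0, 1] of action a in round t;
    eta is the learning rate (an illustrative parameter).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions
    plays = []
    for round_payoffs in payoffs:
        # Weights are exponential in cumulative past payoff (shifted for stability).
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        plays.append([w / total for w in weights])
        # Payoffs are revealed only after the strategy for this round is fixed.
        cumulative = [c + p for c, p in zip(cumulative, round_payoffs)]
    return plays

def average_regret(payoffs, plays):
    """Best fixed action's total payoff minus expected earned payoff, per round."""
    rounds = len(payoffs)
    n_actions = len(payoffs[0])
    earned = sum(
        sum(q * p for q, p in zip(plays[t], payoffs[t])) for t in range(rounds)
    )
    best_fixed = max(
        sum(payoffs[t][a] for t in range(rounds)) for a in range(n_actions)
    )
    return (best_fixed - earned) / rounds

# Made-up data: action 1 always pays 1, action 0 always pays 0.
T = 200
payoffs = [[0.0, 1.0] for _ in range(T)]
eta = math.sqrt(8.0 * math.log(2) / T)
plays = hedge(payoffs, eta)
regret = average_regret(payoffs, plays)
```

A learner is called no-regret when this average regret shrinks toward zero as the number of rounds grows; here the learner starts uniform and shifts its weight toward the better action.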