
    Free Energy and the Generalized Optimality Equations for Sequential Decision Making

    The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision-rules such as Expectimax, Minimax and Expectiminimax. We show how these decision-rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments. Comment: 10 pages, 2 figures
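    The role of a per-node resource parameter can be illustrated with a minimal sketch. It assumes the free energy at a node takes the common certainty-equivalent form (1/beta) * log sum_x p(x) exp(beta * U(x)); the function name `free_energy_value` and the symbol `beta` are illustrative choices, not taken from the abstract. Under that assumption, a large positive beta approaches a max node, beta near zero gives an expectation node, and a large negative beta approaches a min node.

```python
import math

def free_energy_value(utilities, probs, beta):
    """Certainty equivalent (1/beta) * log sum_x p(x) * exp(beta * U(x)).

    `beta` plays the role of the node's resource parameter (an assumption
    of this sketch): beta -> +inf behaves like max, beta -> 0 like the
    expectation, beta -> -inf like min.
    """
    if abs(beta) < 1e-12:
        # Limit beta -> 0: plain expected utility under `probs`.
        return sum(p * u for p, u in zip(probs, utilities))
    # Work in log space and shift by the largest term for numerical stability.
    terms = [math.log(p) + beta * u for p, u in zip(probs, utilities) if p > 0]
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / beta

utilities = [1.0, 2.0, 4.0]
probs = [0.5, 0.25, 0.25]
for beta in (-1000.0, 0.0, 1000.0):
    print(beta, free_energy_value(utilities, probs, beta))
```

    Applying such an operator recursively from the leaves of a decision tree, with a separate beta at each node, is one way to picture how a single rule can interpolate between min, expectation, and max nodes.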

    Evolutionary Game Theory

    Article

    Variational optimization of probability measure spaces resolves the chain store paradox

    In game theory, players have continuous expected payoff functions and can use fixed point theorems to locate equilibria. This optimization method requires that players adopt a particular type of probability measure space. Here, we introduce alternate probability measure spaces altering the dimensionality, continuity, and differentiability properties of what are now the game's expected payoff functionals. Optimizing such functionals requires generalized variational and functional optimization methods to locate novel equilibria. These variational methods can reconcile game theoretic prediction and observed human behaviours, as we illustrate by resolving the chain store paradox. Our generalized optimization analysis has significant implications for economics, artificial intelligence, complex system theory, neurobiology, and biological evolution and development. Comment: 11 pages, 5 figures. Replaced for minor notational correction

    Econometrics for Learning Agents

    The main goal of this paper is to develop a theory of inference of player valuations from observed data in the generalized second price auction without relying on the Nash equilibrium assumption. Existing work in economics on inferring agent values from data relies on the assumption that all participant strategies are best responses to the observed play of other players, i.e. they constitute a Nash equilibrium. In this paper, we show how to perform inference relying on a weaker assumption instead: assuming that players are using some form of no-regret learning. Learning outcomes have emerged in recent years as an attractive alternative to Nash equilibrium in analyzing game outcomes, modeling players who haven't reached a stable equilibrium, but rather use algorithmic learning, aiming to learn the best way to play from previous observations. In this paper we show how to infer values of players who use algorithmic learning strategies. Such inference is an important first step before we move to testing any learning theoretic behavioral model on auction data. We apply our techniques to a dataset from Microsoft's sponsored search ad auction system.
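    As a rough illustration of what "no-regret learning" means, the sketch below runs Hedge (multiplicative weights), one standard no-regret algorithm. The abstract does not say which learning rule the players use, so the choice of Hedge, the function name `hedge`, and the toy loss sequence are all assumptions made for illustration. Regret here is the learner's cumulative expected loss minus the loss of the best single action in hindsight.

```python
import math

def hedge(loss_rounds, eta):
    """Run Hedge on per-round loss vectors (each loss in [0, 1]).

    Returns the regret: the learner's cumulative expected loss minus the
    cumulative loss of the best fixed action in hindsight.
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    learner_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of playing an action drawn from `probs` this round.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        # Exponentially down-weight actions in proportion to their loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best_fixed = min(sum(r[i] for r in loss_rounds) for i in range(n))
    return learner_loss - best_fixed

# Toy two-action loss sequence (hypothetical data, not from the paper).
T = 900
rounds = [[0.0, 1.0] if t % 3 else [1.0, 0.0] for t in range(T)]
eta = math.sqrt(8 * math.log(2) / T)
print(hedge(rounds, eta), math.sqrt(T * math.log(2) / 2))
```

    With the step size eta = sqrt(8 ln n / T), the standard analysis bounds the regret by sqrt(T ln n / 2), so average regret per round vanishes as T grows. This vanishing-regret property is the kind of assumption that can stand in for exact best response.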