1,386 research outputs found

    Learning by (limited) forward looking players

    We present a model of adaptive economic agents who are k periods forward looking. Agents in our model are randomly matched to interact in finitely repeated games. They form beliefs by learning from the past behavior of others and then best respond to these beliefs, looking k periods ahead. We establish almost sure convergence of our stochastic process and characterize absorbing sets. These can be very different from the predictions of both the fully rational model and the adaptive but myopic case. In particular, we find that non-Nash outcomes can also be sustained whenever they satisfy a "local" efficiency condition. We then characterize stochastically stable states in a class of 2 × 2 games and show that, under certain conditions, the efficient action in Prisoner's Dilemma games and coordination games can be singled out as uniquely stochastically stable. We show that our results are consistent with typical patterns observed in experiments on finitely repeated Prisoner's Dilemma games and, in particular, can explain what are commonly called the "endgame effect" and the "restart effect". Finally, if populations are composed of some myopic and some forward looking agents, parameter constellations exist such that either type might obtain higher average payoffs.

    Learning to play games in extensive form by valuation

    A valuation for a board game is an assignment of numeric values to different states of the board. The valuation reflects the desirability of the states for the player. It can be used by a player to decide on her next move during the play. We assume a myopic player, who chooses a move with the highest valuation. Valuations can also be revised, and hopefully improved, after each play of the game. Here, a very simple valuation revision is considered, in which the states of the board visited in a play are assigned the payoff obtained in the play. We show that by adopting such a learning process a player who has a winning strategy in a win-lose game can almost surely guarantee a win in a repeated game. When a player has more than two payoffs, a more elaborate learning procedure is required. We consider one that associates with each state the average payoff in the rounds in which this state was reached. When all players adopt this learning procedure, with some perturbations, then, with probability 1, strategies that are close to subgame perfect equilibrium are played after some time. A single player who adopts this procedure can guarantee only her individually rational payoff.
    Keywords: reinforcement learning
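
    The simple valuation revision described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tiny two-stage game `GAME`, the optimistic initial valuation of 1, the tie-breaking rule (first move listed), and the adversarial opponent. The learner myopically picks the state with the highest valuation, and after each play the visited state is assigned the payoff obtained.

```python
# Hypothetical two-stage win-lose game (not from the paper).
# The learner moves first, then an opponent replies.
# Terminal payoffs are the learner's: 1 = win, 0 = loss.
GAME = {
    "a": {"a1": 1, "a2": 0},  # the opponent can force a loss after "a"
    "b": {"b1": 1, "b2": 1},  # "b" wins whatever the opponent does
}


def play_once(valuation):
    """Play one round; return the state the learner moved to and her payoff."""
    # Myopic choice: the state with the highest valuation (ties -> first listed).
    state = max(GAME, key=lambda s: valuation[s])
    replies = GAME[state]
    # Assumption: the opponent picks the reply that is worst for the learner.
    reply = min(replies, key=replies.get)
    return state, replies[reply]


def learn(rounds=10, initial=1.0):
    """Repeat the game, revising valuations after every play."""
    valuation = {s: initial for s in GAME}
    history = []
    for _ in range(rounds):
        state, payoff = play_once(valuation)
        # Simple revision: the visited state is assigned the payoff obtained.
        valuation[state] = payoff
        history.append((state, payoff))
    return valuation, history


if __name__ == "__main__":
    final_valuation, plays = learn(rounds=5)
    print(final_valuation)
    print(plays)
```

    In this toy setting the learner loses once after trying "a", lowers that state's valuation, and then keeps choosing the winning move "b". The averaging procedure mentioned for games with more than two payoffs would replace the assignment `valuation[state] = payoff` with a running mean; that variant is not shown here.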

    Stochastic learning in co-ordination games : a simulation approach

    In the presence of externalities, consumption behaviour depends on the solution of a co-ordination problem. In our paper we suggest a learning approach to the study of co-ordination in consumption contexts where agents adjust their choices on the basis of the reinforcement (payoff) they receive during the game. The results of simulations allowed us to distinguish the roles of different aspects of learning in enabling co-ordination within a population of agents. Our main results highlight: (1) the role played by the speed of learning in determining failures of the co-ordination process; (2) the effect of forgetting past experiences on the speed of the co-ordination process; (3) the role of experimentation in bringing the process of co-ordination into an efficient equilibrium.

    Foresighted Demand Side Management

    We consider a smart grid with an independent system operator (ISO) and distributed aggregators who have energy storage and purchase energy from the ISO to serve their customers. All the entities in the system are foresighted: each aggregator seeks to minimize its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO seeks to minimize the long-term total cost of the system (e.g., energy generation costs and the aggregators' costs) by dispatching the energy production among the generators. The decision making of the entities is complicated for two reasons. First, the information is decentralized: the ISO does not know the aggregators' states (i.e., their energy consumption requests from customers and the amount of energy in their storage), and each aggregator does not know the other aggregators' states or the ISO's state (i.e., the energy generation costs and the status of the transmission lines). Second, the coupling among the aggregators is unknown to them. Specifically, each aggregator's energy purchase affects the price, and hence the payments of the other aggregators. However, none of them knows how its decision influences the price, because the price is determined by the ISO based on its state. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite being decentralized and involving complex coupling among the various entities.