
    Reputation and commitment in two-person repeated games without discounting

    Two-person repeated games without discounting are considered in which there is uncertainty about the types of the players. If there is a possibility that a player is an automaton committed to a particular pure or mixed stage-game action, then this provides a lower bound on the Nash equilibrium payoffs to a normal type of this player. The lower bound is the best available and is robust to the existence of other types. The results are extended to the case of two-sided uncertainty. This work extends Schmidt (1993), who analyzed the restricted class of conflicting-interest games.
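    The commitment bound described in this abstract can be illustrated with the standard reputation-style inequality below; the notation (and the exact form of the bound in the no-discounting setting) is an assumption for illustration, not taken from the paper.

```latex
% Sketch of a commitment-payoff lower bound of the kind described above
% (notation assumed; the paper's exact statement may differ).
% If player 1 may be an automaton committed to the (possibly mixed) stage-game
% action \alpha_1, the normal type's Nash equilibrium payoff v_1 is bounded below
% by the worst payoff consistent with player 2 best-responding to \alpha_1:
\[
  v_1 \;\ge\; \min_{\alpha_2 \in \mathrm{BR}_2(\alpha_1)} u_1(\alpha_1, \alpha_2),
\]
% and taking the best available commitment type gives the tightest such bound:
\[
  v_1 \;\ge\; \sup_{\alpha_1 \in \Delta(A_1)} \;
              \min_{\alpha_2 \in \mathrm{BR}_2(\alpha_1)} u_1(\alpha_1, \alpha_2).
\]
```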

    From Weak Learning to Strong Learning in Fictitious Play Type Algorithms

    The paper studies the highly prototypical Fictitious Play (FP) algorithm, as well as a broad class of learning processes based on best-response dynamics, which we refer to as FP-type algorithms. A well-known shortcoming of FP is that, while players may learn an equilibrium strategy in some abstract sense, there are no guarantees that the period-by-period strategies generated by the algorithm actually converge to equilibrium themselves. This issue is fundamentally related to the discontinuous nature of the best-response correspondence and is inherited by many FP-type algorithms. Not only does it cause problems in the interpretation of such algorithms as mechanisms for economic and social learning, but it also greatly diminishes their practical value for use in distributed control. We refer to forms of learning in which players learn equilibria only in some abstract sense (to be defined more precisely in the paper) as weak learning, and to forms of learning in which players' period-by-period strategies converge to equilibrium as strong learning. An approach is presented for modifying an FP-type algorithm that achieves weak learning in order to construct a variant that achieves strong learning; theoretical convergence results are proved.
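    A minimal sketch of standard two-player fictitious play is given below to make the weak/strong distinction concrete: the empirical frequencies (beliefs) may converge to an equilibrium even though the per-period pure actions keep switching. This is textbook FP, not the modified algorithm proposed in the paper, and the matching-pennies example is chosen only for illustration.

```python
# Standard two-player fictitious play (textbook version, not the paper's modification).
# "Weak" learning: empirical frequencies approach equilibrium; "strong" learning would
# additionally require the period-by-period strategies themselves to converge.
import numpy as np

def fictitious_play(A, B, T=5000):
    """A, B: payoff matrices for players 1 and 2 (rows index player 1's actions)."""
    n, m = A.shape
    counts1, counts2 = np.ones(n), np.ones(m)   # initial fictitious counts
    actions = []
    for _ in range(T):
        belief2 = counts2 / counts2.sum()       # player 1's belief about player 2
        belief1 = counts1 / counts1.sum()       # player 2's belief about player 1
        a1 = int(np.argmax(A @ belief2))        # pure best response to empirical play
        a2 = int(np.argmax(belief1 @ B))
        counts1[a1] += 1
        counts2[a2] += 1
        actions.append((a1, a2))
    return counts1 / counts1.sum(), counts2 / counts2.sum(), actions

# Matching pennies: beliefs converge to the mixed equilibrium (1/2, 1/2) -- weak learning --
# while the pure actions keep cycling and never settle, so there is no strong learning.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
freq1, freq2, actions = fictitious_play(A, -A)
print(np.round(freq1, 3), np.round(freq2, 3), actions[-5:])
```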

    On Similarities between Inference in Game Theory and Machine Learning

    In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in doing so is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and, as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore, we consider the converse case and show how insights from game theory can be used to derive two improved mean-field variational learning algorithms. We first show that the standard update rule of mean-field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule and show that this results in fictitious variational play, an improved mean-field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative-action variational learning algorithm that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).
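    One way to picture the probabilistic moderation described here is sketched below: instead of best-responding to the empirical average of opponent play, the learner keeps a Dirichlet posterior over the opponent's mixed strategy and plays each action with the posterior probability that it is a best response. The Dirichlet/sampling construction and the specific coordination payoffs are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of a "moderated" (Bayesian) response, contrasted with the point-estimate
# best response of standard fictitious play; the paper's exact update may differ.
import numpy as np

def moderated_response(U, counts, rng, n_samples=2000):
    """U: own payoff matrix (own actions x opponent actions).
    counts: Dirichlet parameters summarising observed opponent play (prior + counts).
    Returns a mixed strategy: P(action is a best response) under the posterior."""
    strategies = rng.dirichlet(counts, size=n_samples)   # posterior samples of opponent play
    best = np.argmax(strategies @ U.T, axis=1)           # best response to each sampled strategy
    return np.bincount(best, minlength=U.shape[0]) / n_samples

# Symmetric coordination game (illustrative numbers): action 0 is payoff-dominant,
# action 1 is risk-dominant.
U = np.array([[9.0, 0.0],
              [8.0, 7.0]])
rng = np.random.default_rng(0)
counts = np.array([1.0, 1.0])                  # uniform prior, no observations yet
print(moderated_response(U, counts, rng))      # moderated play spreads weight across actions
print(np.argmax(U @ (counts / counts.sum())))  # standard FP would commit to a single action
```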

    Jamming Games in the MIMO Wiretap Channel With an Active Eavesdropper

    This paper investigates reliable and covert transmission strategies in a multiple-input multiple-output (MIMO) wiretap channel with a transmitter, a receiver, and an adversarial wiretapper, each equipped with multiple antennas. In a departure from existing work, the wiretapper possesses the novel capability to act either as a passive eavesdropper or as an active jammer, under a half-duplex constraint. The transmitter therefore faces a choice between allocating all of its power to data or broadcasting artificial interference along with the information signal in an attempt to jam the eavesdropper (whose instantaneous channel state is assumed unknown). To examine the resulting trade-offs for the legitimate transmitter and the adversary, we model their interactions as a two-person zero-sum game with the ergodic MIMO secrecy rate as the payoff function. We first examine conditions for the existence of pure-strategy Nash equilibria (NE) and the structure of mixed-strategy NE for the strategic form of the game. We then derive equilibrium strategies for the extensive form of the game, where players move sequentially under scenarios of perfect and imperfect information. Finally, numerical simulations are presented to examine the equilibrium outcomes of the various scenarios considered.
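    As a concrete companion to the strategic-form analysis sketched here, the snippet below computes a mixed-strategy equilibrium of a finite two-person zero-sum game by linear programming, with the transmitter choosing between pure data transmission and data plus artificial interference, and the wiretapper choosing between eavesdropping and jamming. The 2x2 secrecy-rate entries are hypothetical placeholders, not values from the paper.

```python
# Hedged illustration: mixed-strategy equilibrium of a finite two-person zero-sum game via
# linear programming.  Payoff entries (ergodic secrecy rates) are hypothetical placeholders.
import numpy as np
from scipy.optimize import linprog

def zero_sum_equilibrium(R):
    """R[i, j]: row (transmitter) payoff when the transmitter plays i
    (0 = all power to data, 1 = data + artificial interference) and the
    wiretapper plays j (0 = eavesdrop, 1 = jam).
    Returns (game value, transmitter's equilibrium mixed strategy)."""
    R = np.asarray(R, dtype=float)
    shift = R.min() - 1.0            # shift payoffs to be strictly positive
    G = R - shift
    m, n = G.shape
    # max { v : G^T x >= v, sum(x) = 1 }  <=>  min { sum(y) : G^T y >= 1, y >= 0 },  x = y / sum(y)
    res = linprog(c=np.ones(m), A_ub=-G.T, b_ub=-np.ones(n), bounds=[(0, None)] * m)
    value = 1.0 / res.x.sum()
    return value + shift, res.x * value

R = [[1.2, 0.4],   # hypothetical secrecy rates (bits/s/Hz)
     [0.8, 0.9]]
value, tx_strategy = zero_sum_equilibrium(R)
print(value, tx_strategy)
```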

    Incentive and stability in the Rock-Paper-Scissors game: an experimental investigation

    In a two-person Rock-Paper-Scissors (RPS) game, if a loss is worth nothing, a tie is worth 1, and the payoff of winning (the incentive a) is treated as a variable, the game is called a generalized RPS game. The generalized RPS game is a representative mathematical model of game dynamics and appears widely in textbooks. However, how actual motions in these games depend on the incentive has never been reported quantitatively. Using data from 7 games with different incentives, comprising 84 groups of 6 subjects each playing 300 rounds in random-pairing tournaments with local information recorded, we find that, at both the social and the individual level, the actual motions change continuously with the incentive. More specifically, some representative findings are: (1) in the view of collective strategy transitions at the social level, the forward transition vector field becomes more centripetal as the stability of the system increases; (2) in the view of individual strategy-transition behavior, a phase transition occurs as the stability of the system increases, with the transition point lying near the standard RPS game; (3) conditional response behaviors change structurally with the controlled incentive. Overall, best-response behavior increases with the incentive while win-stay lose-shift (WSLS) behavior declines. Further, the outcomes win, tie, and lose influence both best-response and WSLS behavior: among the best-response behaviors, win-stay declines with the incentive while lose-left-shift increases; among the WSLS behaviors, lose-left-shift increases with the incentive while lose-right-shift declines. We hope to learn which of the many candidate learning models can account for the empirical observations above.
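    The payoff structure described above (loss = 0, tie = 1, win = a) is easy to write down explicitly; the sketch below builds it and runs textbook replicator dynamics around the uniform mixed equilibrium, a standard theoretical benchmark for the kind of stability the experiment measures. The replicator model and the specific incentive values are illustrative, not the paper's empirical analysis.

```python
# Generalized Rock-Paper-Scissors payoff matrix (loss = 0, tie = 1, win = a) and textbook
# replicator dynamics; illustrative only, not the paper's experimental protocol or analysis.
import numpy as np

def generalized_rps(a):
    """Row player's payoff matrix with actions ordered (Rock, Paper, Scissors)."""
    win, tie, lose = float(a), 1.0, 0.0
    return np.array([[tie, lose, win],    # Rock     vs (R, P, S)
                     [win, tie, lose],    # Paper    vs (R, P, S)
                     [lose, win, tie]])   # Scissors vs (R, P, S)

def replicator_step(x, U, dt=0.01):
    """One Euler step of replicator dynamics dx_i = x_i * ((Ux)_i - x.Ux)."""
    payoffs = U @ x
    return x + dt * x * (payoffs - x @ payoffs)

# For any incentive a > 1 the mixed Nash equilibrium is uniform; what changes with a is the
# stability of the dynamics around it (textbook result: asymptotically stable for a > 2,
# neutral cycles at the standard RPS value a = 2, unstable for a < 2).
for a in (1.1, 2.0, 4.0):
    U, x = generalized_rps(a), np.array([0.5, 0.3, 0.2])
    for _ in range(20000):
        x = replicator_step(x, U)
    print(a, np.round(x, 3))   # distance from (1/3, 1/3, 1/3) reflects (in)stability
```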

    A Foundation for Markov Equilibria in Infinite Horizon Perfect Information Games

    We study perfect information games with an infinite horizon played by an arbitrary number of players. This class of games includes infinitely repeated perfect information games, repeated games with asynchronous moves, games with long- and short-run players, games with overlapping generations of players, and canonical non-cooperative models of bargaining. We consider two restrictions on equilibria. An equilibrium is purifiable if close-by behavior is consistent with equilibrium when agents' payoffs at each node are perturbed additively and independently. An equilibrium has bounded recall if there exists K such that at most one player's strategy depends on what happened more than K periods earlier. We show that only Markov equilibria have bounded recall and are purifiable. Thus, if a game has at most one long-run player, all purifiable equilibria are Markov.
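    The bounded-recall condition quoted in this abstract can be written out formally; the notation below is one possible formalization and is not taken verbatim from the paper.

```latex
% One way to formalize bounded recall as described above (notation assumed).
% A strategy profile \sigma has bounded recall if there is a K such that, for all
% players j with at most one exception i, behaviour at any history depends only on
% the last K periods of play:
\[
  \exists K \;\, \exists i \;\, \forall j \neq i \;\, \forall h, h' :\quad
  h|_{K} = h'|_{K} \;\Longrightarrow\; \sigma_j(h) = \sigma_j(h'),
\]
% where h|_K denotes the truncation of the history h to its last K periods.
```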