
    Stationary Anonymous Sequential Games with Undiscounted Rewards

    Stationary anonymous sequential games with undiscounted rewards are a special class of games that combines features from both population games (infinitely many players) and stochastic games. We extend the theory of these games to the cases of total expected reward as well as expected average reward. We show that equilibria in the anonymous sequential game correspond to the limits of equilibria of related finite-population games as the number of players grows to infinity. We provide examples to illustrate our results.

    Applications of Stationary Anonymous Sequential Games to Multiple Access Control in Wireless Communications

    We consider dynamic Multiple Access Control (MAC) games between a random number of players competing over collision channels. Each of several mobiles involved in an interaction determines whether to transmit at high or at low power. High power decreases the lifetime of the battery but results in a smaller collision probability. We formulate this game as an anonymous sequential game with undiscounted reward, a class we recently introduced that combines features from both population games (infinitely many players) and stochastic games. We briefly present this class of games and basic equilibrium-existence results for the total expected reward as well as for the expected average reward. We then apply the theory to the MAC game.
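    As a toy illustration of the kind of symmetric population game involved, the sketch below finds the equilibrium fraction of high-power transmitters in a two-strategy collision model. All payoff numbers (throughput coefficients, battery cost) are hypothetical and not taken from the paper; the point is only that an interior equilibrium is the fraction at which high and low power yield equal payoffs.

```python
def mac_equilibrium():
    """Bisect for the fraction x of high-power users at which the two
    strategies are payoff-indifferent.  Payoffs are hypothetical:
    throughput decreases in x (more collisions), and high power pays
    an extra battery-lifetime cost."""
    u_high = lambda x: 0.9 * (1 - 0.8 * x) - 0.25  # better channel, battery cost
    u_low = lambda x: 0.6 * (1 - 0.3 * x)          # worse channel, no extra cost
    lo, hi = 0.0, 1.0
    for _ in range(60):                 # bisection on the payoff gap
        mid = (lo + hi) / 2
        if u_high(mid) > u_low(mid):
            lo = mid                    # high power still attractive: more adopt it
        else:
            hi = mid
    return (lo + hi) / 2
```

    With these invented numbers the payoff gap is linear and decreasing in x, so the bisection converges to the unique interior equilibrium.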

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
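    As a minimal illustration of the exploration/exploitation trade-off the survey discusses, the sketch below runs an epsilon-greedy agent on a hypothetical three-armed Bernoulli bandit (the arm means and epsilon are invented for the example):

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Epsilon-greedy agent: with probability epsilon explore a random
    arm, otherwise exploit the arm with the highest estimated mean."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                         # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental running-mean update of the arm's value estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward / steps
```

    Pure exploitation can lock onto a suboptimal arm after a lucky early reward; the epsilon fraction of random pulls keeps every arm's estimate converging.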

    Efficient methods for near-optimal sequential decision making under uncertainty

    This chapter discusses decision making under uncertainty. More specifically, it offers an overview of efficient Bayesian and distribution-free algorithms for making near-optimal sequential decisions under uncertainty about the environment. Due to the uncertainty, such algorithms must not only learn from their interaction with the environment but also perform as well as possible while learning is taking place. © 2010 Springer-Verlag Berlin Heidelberg
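    One canonical Bayesian algorithm in this family is Thompson sampling, which performs well while learning by acting greedily with respect to a sample from the posterior. A minimal Beta-Bernoulli sketch (the arm means are hypothetical; the chapter's own algorithms are not reproduced here):

```python
import random

def thompson_sampling(true_means, steps=3000, seed=1):
    """Beta-Bernoulli Thompson sampling: sample each arm's success
    probability from its posterior and pull the arm with the largest
    sample, then update that arm's Beta posterior."""
    rng = random.Random(seed)
    n = len(true_means)
    alpha = [1.0] * n          # Beta(1, 1) uniform priors
    beta = [1.0] * n
    pulls = [0] * n
    for _ in range(steps):
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n)]
        arm = max(range(n), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward           # posterior update: successes
        beta[arm] += 1 - reward        # posterior update: failures
        pulls[arm] += 1
    return pulls
```

    Posterior sampling explores exactly as much as the remaining uncertainty warrants: as an arm's posterior concentrates, it is pulled either almost always or almost never.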

    Private Information in Sequential Common-Value Auctions

    We study an infinitely-repeated first-price auction with common values. Initially, bidders receive independent private signals about the objects' value, which itself does not change over time. Learning occurs only through observation of the bids. Under one-sided incomplete information, this information is eventually revealed and the seller extracts essentially the entire rent (for large discount factors). Both players' payoffs tend to zero as the discount factor tends to one. However, the uninformed bidder does relatively better than the informed bidder. We discuss the case of two-sided incomplete information, and argue that, under a Markovian refinement, the outcome is pooling: information is revealed only insofar as it does not affect prices. Bidders submit a common, low bid in the tradition of collusion without conspiracy.
    Keywords: repeated game with incomplete information; private information; ratchet effect; first-price auction; dynamic auctions

    Discrete-time controlled Markov processes with average cost criterion: a survey

    This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation.
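    A standard method covered by such surveys is relative value iteration, which computes the optimal gain (average cost) together with a bias function. Below is a minimal sketch on a hypothetical two-state machine-maintenance MDP; the transition probabilities and costs are invented for the example.

```python
def relative_value_iteration(P, c, ref=0, iters=200):
    """Relative value iteration for an average-cost MDP.

    P[a][s][t] is the probability of moving s -> t under action a,
    c[a][s] the one-step cost; returns (gain, bias) with bias[ref] = 0."""
    n_states = len(c[0])
    n_actions = len(c)
    h = [0.0] * n_states
    g = 0.0
    for _ in range(iters):
        Th = [min(c[a][s] + sum(P[a][s][t] * h[t] for t in range(n_states))
                  for a in range(n_actions)) for s in range(n_states)]
        g = Th[ref]                  # gain estimate (optimal average cost)
        h = [v - g for v in Th]      # anchor at ref to keep the iterates bounded

    return g, h

# Hypothetical two-state machine: state 0 = good, state 1 = broken.
# Action 0 = run (free while good, cost 2 while broken);
# action 1 = repair (cost 1, returns the machine to the good state).
P = [[[0.7, 0.3], [0.0, 1.0]],   # run
     [[1.0, 0.0], [1.0, 0.0]]]   # repair
c = [[0.0, 2.0], [1.0, 1.0]]
```

    For this toy model the optimal policy is to repair only when broken, giving gain 0.3/1.3, which the iteration recovers.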

    A Minimum Relative Entropy Principle for Learning and Acting

    This paper proposes a method to construct an adaptive agent that is universal with respect to a given class of experts, where each expert is an agent that has been designed specifically for a particular environment. This adaptive control problem is formalized as the problem of minimizing the relative entropy of the adaptive agent from the expert that is most suitable for the unknown environment. If the agent is a passive observer, then the optimal solution is the well-known Bayesian predictor. However, if the agent is active, then its past actions need to be treated as causal interventions on the I/O stream rather than as ordinary probability conditioning. Here it is shown that the solution to this new variational problem is given by a stochastic controller called the Bayesian control rule, which implements adaptive behavior as a mixture of experts. Furthermore, it is shown that under mild assumptions, the Bayesian control rule converges to the control law of the most suitable expert.
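    The Bayesian control rule can be sketched as posterior sampling over experts. The toy below is a loose illustration, not the paper's construction: two hypothetical environments, each with a trivial expert, where the posterior is updated only from the observed reward likelihood (actions are treated as interventions and contribute no likelihood of their own).

```python
import math
import random

def bayesian_control_rule_demo(true_env, steps=400, seed=2):
    """Hypothetical environment m makes arm m good (reward prob 0.8)
    and the other arm bad (0.2); expert m's control law is simply
    "pull arm m".  Each step: sample an environment hypothesis from
    the posterior, act with its expert, update from the observation."""
    rng = random.Random(seed)

    def lik(m, arm, r):            # reward likelihood under environment m
        p = 0.8 if arm == m else 0.2
        return p if r == 1 else 1 - p

    log_w = [0.0, 0.0]             # log posterior weights over environments
    for _ in range(steps):
        mx = max(log_w)
        w = [math.exp(v - mx) for v in log_w]
        m = rng.choices([0, 1], weights=w)[0]  # sample a hypothesis
        arm = m                                # act with expert m's control law
        r = 1 if rng.random() < (0.8 if arm == true_env else 0.2) else 0
        for env in (0, 1):                     # update from the observation only
            log_w[env] += math.log(lik(env, arm, r))
    mx = max(log_w)
    w = [math.exp(v - mx) for v in log_w]
    s = sum(w)
    return [v / s for v in w]      # normalized posterior
```

    As the posterior concentrates on the true environment, the sampled hypothesis (and hence the active expert) stabilizes, matching the convergence result stated in the abstract.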