
    Coordination of independent learners in cooperative Markov games.

    In fully cooperative multi-agent systems, independent agents learning by reinforcement must overcome several difficulties, such as coordination and the impact of exploration. Studying these issues first allows us to synthesize the characteristics of existing decentralized reinforcement learning methods for independent learners in cooperative Markov games. Then, given the difficulties these approaches encounter, we focus on two main skills: optimistic agents, which manage coordination in deterministic environments, and the detection of the stochasticity of a game. Indeed, the key difficulty in stochastic environments is to distinguish between the various causes of noise. We therefore introduce the SOoN algorithm, standing for “Swing between Optimistic or Neutral”, in which independent learners adapt automatically to the stochasticity of the environment. Empirical results on various cooperative Markov games show, notably, that SOoN overcomes the main factors of non-coordination and is robust to the exploration of other agents.
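As a rough illustration of what "optimistic agents" can mean in a deterministic cooperative game, the sketch below uses an optimistic update of the kind found in distributed Q-learning: each independent learner keeps, for each of its own actions, only the best shared reward it has ever observed. This is not the SOoN algorithm from the abstract. The payoff matrix, the function name `train_optimistic`, and the uniform random exploration are all assumptions made for the example.

```python
import random

# Shared payoff of a two-agent, single-state cooperative game.
# The matrix is an assumption for illustration: miscoordination around
# the best joint action (0, 0) is heavily penalized.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]


def train_optimistic(episodes=2000, seed=0):
    """Train two independent optimistic learners on PAYOFF.

    Each agent sees only its own action and the shared reward.
    The optimistic update keeps the maximum reward ever observed for
    an action, so penalties caused by the other agent's exploration
    are ignored. This is only sound when rewards are deterministic.
    """
    rng = random.Random(seed)
    n = len(PAYOFF)
    q1 = [float("-inf")] * n  # agent 1: one value per own action
    q2 = [float("-inf")] * n  # agent 2: one value per own action
    for _ in range(episodes):
        a1 = rng.randrange(n)  # pure exploration, for simplicity
        a2 = rng.randrange(n)
        r = PAYOFF[a1][a2]
        q1[a1] = max(q1[a1], r)  # optimistic update, agent 1
        q2[a2] = max(q2[a2], r)  # optimistic update, agent 2
    return q1, q2


if __name__ == "__main__":
    q1, q2 = train_optimistic()
    print("agent 1 values:", q1)
    print("agent 2 values:", q2)
```

Because each agent remembers only the best outcome per action, both end up preferring action 0, which is the optimal joint action in this matrix. In a stochastic game the same update would overestimate actions that were merely lucky once, which is the kind of difficulty the abstract attributes to stochastic environments.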

    The international stock pollutant control: a stochastic formulation

    In this paper we provide a stochastic dynamic game formulation of the economics of international environmental agreements on transnational pollution control when the environmental damage arises from a stock pollutant that accumulates, such as CO2 in the atmosphere. To improve the cooperative and the noncooperative equilibrium among countries, we propose the criterion of minimizing the expected discounted total cost. Moreover, we consider Stochastic Dynamic Games formulated as Stochastic Dynamic Programming, and Cooperative versus Noncooperative Stochastic Dynamic Games. The performance of the proposed schemes is illustrated by an example based on real data.

    Resolution of the stochastic strategy spatial prisoner's dilemma by means of particle swarm optimization

    We study the evolution of cooperation among selfish individuals in the stochastic strategy spatial prisoner's dilemma game. We equip players with the particle swarm optimization technique, and find that it may lead to highly cooperative states even if the temptations to defect are strong. The concept of particle swarm optimization was originally introduced within a simple model of social dynamics that can describe the formation of a swarm, i.e., analogous to a swarm of bees searching for a food source. Essentially, particle swarm optimization foresees changes in the velocity profile of each player, such that the best locations are targeted and eventually occupied. In our case, each player keeps track of the highest payoff attained within a local topological neighborhood and its individual highest payoff. Thus, players make use of their own memory that keeps score of the most profitable strategy in previous actions, as well as use of the knowledge gained by the swarm as a whole, to find the best available strategy for themselves and the society. Following extensive simulations of this setup, we find a significant increase in the level of cooperation for a wide range of parameters, and also a full resolution of the prisoner's dilemma. We also demonstrate extreme efficiency of the optimization algorithm when dealing with environments that strongly favor the proliferation of defection, which in turn suggests that swarming could be an important phenomenon by means of which cooperation can be sustained even under highly unfavorable conditions. We thus present an alternative way of understanding the evolution of cooperative behavior and its ubiquitous presence in nature, and we hope that this study will be inspirational for future efforts aimed in this direction. Comment: 12 pages, 4 figures; accepted for publication in PLoS ON
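The velocity-based strategy update described in this abstract can be sketched with the textbook particle swarm rule, assuming a player's stochastic strategy is a single cooperation probability `x` in [0, 1]. The function name `pso_update`, the inertia weight `w`, and the acceleration coefficients `c1` and `c2` are assumptions for illustration; the paper's exact update rule and parameter values may differ.

```python
import random


def pso_update(x, v, own_best_x, neigh_best_x, rng, w=0.7, c1=1.5, c2=1.5):
    """One standard PSO step for a single player's strategy.

    x            -- current cooperation probability, in [0, 1]
    v            -- current velocity of the strategy
    own_best_x   -- strategy that gave this player its highest payoff so far
    neigh_best_x -- strategy with the highest payoff in the local neighborhood
    rng          -- a random.Random instance
    w, c1, c2    -- assumed inertia and acceleration coefficients
    """
    r1, r2 = rng.random(), rng.random()
    # Velocity is pulled toward the player's own best strategy and toward
    # the best strategy seen in its neighborhood.
    v_new = w * v + c1 * r1 * (own_best_x - x) + c2 * r2 * (neigh_best_x - x)
    # Keep the strategy a valid probability.
    x_new = min(1.0, max(0.0, x + v_new))
    return x_new, v_new


if __name__ == "__main__":
    rng = random.Random(1)
    x, v = 0.2, 0.0
    for step in range(5):
        x, v = pso_update(x, v, own_best_x=0.8, neigh_best_x=0.9, rng=rng)
        print(step, x, v)
```

In a full simulation each player on the lattice would play the prisoner's dilemma with its neighbors using `x`, record payoffs, refresh `own_best_x` and `neigh_best_x`, and then apply this step. Only the update itself is shown here.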