Coordination of independent learners in cooperative Markov games.
In fully cooperative multi-agent systems, independent agents that learn by reinforcement must overcome several difficulties, such as coordination and the impact of exploration. Studying these issues first allows us to synthesize the characteristics of existing decentralized reinforcement learning methods for independent learners in cooperative Markov games. Then, given the difficulties these approaches encounter, we focus on two main skills: optimistic agents, which manage coordination in deterministic environments, and detection of the stochasticity of a game. Indeed, the key difficulty in stochastic environments is to distinguish between the various causes of noise. We therefore introduce the SOoN algorithm, standing for “Swing between Optimistic or Neutral”, in which independent learners adapt automatically to the stochasticity of the environment. Empirical results on various cooperative Markov games show notably that SOoN overcomes the main factors of non-coordination and is robust to the exploration of other agents.
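The idea of an optimistic independent learner in a deterministic cooperative game can be illustrated with a minimal sketch. This is a generic optimistic update, not the SOoN algorithm itself: each agent keeps a value per own action and only ever raises it, so low rewards caused by a teammate's exploration are ignored. The function names, the payoff matrix, and the uniform random exploration are all assumptions made for illustration.

```python
import random


def optimistic_update(q, action, reward, next_max=0.0, gamma=0.9):
    """Optimistic update: the estimate for `action` never decreases."""
    target = reward + gamma * next_max
    q[action] = max(q[action], target)
    return q


def run_matrix_game(episodes=500, seed=0):
    """Two independent learners on a shared, deterministic payoff matrix.

    The matrix below is an illustrative assumption (rows: agent 1's action,
    columns: agent 2's action). Miscoordination is heavily penalized.
    """
    rng = random.Random(seed)
    payoff = [
        [11, -30, 0],
        [-30, 7, 6],
        [0, 0, 5],
    ]
    q1 = [float("-inf")] * 3  # agent 1 sees only its own action and the reward
    q2 = [float("-inf")] * 3  # agent 2 likewise
    for _ in range(episodes):
        a1 = rng.randrange(3)  # pure random exploration, for simplicity
        a2 = rng.randrange(3)
        reward = payoff[a1][a2]
        # Single-state game, so there is no next state (gamma = 0).
        optimistic_update(q1, a1, reward, gamma=0.0)
        optimistic_update(q2, a2, reward, gamma=0.0)
    return q1, q2


q1, q2 = run_matrix_game()
best_joint_action = (q1.index(max(q1)), q2.index(max(q2)))
```

Because each agent records the best reward ever seen for each of its actions, the penalties do not pull the agents away from the optimal joint action. The same optimism would overestimate values if rewards were noisy, which is the difficulty the abstract raises for stochastic environments.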
The international stock pollutant control: a stochastic formulation
In this paper we provide a stochastic dynamic game formulation of the economics of
international environmental agreements on transnational pollution control when the
environmental damage arises from a stock pollutant that accumulates, such as CO2 in
the atmosphere. To improve the cooperative and the noncooperative
equilibrium among countries, we propose the criterion of minimizing
the expected discounted total cost. Moreover, we consider Stochastic Dynamic Games
formulated as Stochastic Dynamic Programming and Cooperative versus Noncooperative
Stochastic Dynamic Games. The performance of the proposed schemes is
illustrated by an example based on real data.
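As a rough illustration of the criterion named above, the minimization of an expected discounted total cost is usually written as a Bellman recursion. The form below is the generic stochastic dynamic programming equation, not the paper's specific model; the symbols $s$ (pollutant stock), $a$ (emission or abatement decision), $w$ (random shock), $c$ (stage cost), $f$ (stock transition) and $\beta$ (discount factor) are all assumed notation.

```latex
V(s) \;=\; \min_{a \in A(s)} \; \mathbb{E}_{w}\!\left[\, c(s, a, w) \;+\; \beta \, V\!\big(f(s, a, w)\big) \right],
\qquad 0 < \beta < 1 .
```

In a cooperative setting a single planner would minimize the sum of all countries' costs in this recursion, whereas in a noncooperative setting each country would solve its own recursion while taking the other countries' strategies as given.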
Resolution of the stochastic strategy spatial prisoner's dilemma by means of particle swarm optimization
We study the evolution of cooperation among selfish individuals in the
stochastic strategy spatial prisoner's dilemma game. We equip players with the
particle swarm optimization technique, and find that it may lead to highly
cooperative states even if the temptations to defect are strong. The concept of
particle swarm optimization was originally introduced within a simple model of
social dynamics that can describe the formation of a swarm, i.e., analogous to
a swarm of bees searching for a food source. Essentially, particle swarm
optimization foresees changes in the velocity profile of each player, such that
the best locations are targeted and eventually occupied. In our case, each
player keeps track of the highest payoff attained within a local topological
neighborhood and its individual highest payoff. Thus, players make use of their
own memory that keeps score of the most profitable strategy in previous
actions, as well as use of the knowledge gained by the swarm as a whole, to
find the best available strategy for themselves and the society. Following
extensive simulations of this setup, we find a significant increase in the
level of cooperation for a wide range of parameters, and also a full resolution
of the prisoner's dilemma. We also demonstrate extreme efficiency of the
optimization algorithm when dealing with environments that strongly favor the
proliferation of defection, which in turn suggests that swarming could be an
important phenomenon by means of which cooperation can be sustained even under
highly unfavorable conditions. We thus present an alternative way of
understanding the evolution of cooperative behavior and its ubiquitous presence
in nature, and we hope that this study will be inspirational for future efforts
aimed in this direction.
Comment: 12 pages, 4 figures; accepted for publication in PLoS ON
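The update described in the abstract above, where each player combines its own best remembered strategy with the best one seen in its local neighborhood, can be sketched as a standard particle swarm step. This is a textbook-style PSO update under stated assumptions, not necessarily the exact rule used in the paper: the strategy `x` is taken to be a cooperation probability in [0, 1], and `inertia`, `c1`, `c2` and `v_max` are hypothetical parameters.

```python
import random


def clamp(value, low, high):
    """Restrict `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


def pso_step(x, v, own_best, neighborhood_best, rng,
             inertia=0.5, c1=1.0, c2=1.0, v_max=1.0):
    """One particle swarm update of a player's strategy.

    x                 -- current strategy (assumed: probability of cooperating)
    v                 -- current velocity, i.e. the last change in strategy
    own_best          -- strategy that gave this player its highest payoff
    neighborhood_best -- strategy with the highest payoff among its neighbors
    """
    r1, r2 = rng.random(), rng.random()
    v_new = (inertia * v
             + c1 * r1 * (own_best - x)            # pull toward own memory
             + c2 * r2 * (neighborhood_best - x))  # pull toward the swarm
    v_new = clamp(v_new, -v_max, v_max)
    x_new = clamp(x + v_new, 0.0, 1.0)  # keep a valid probability
    return x_new, v_new


rng = random.Random(1)
x, v = 0.2, 0.0
for _ in range(20):
    # Both remembered optima are fixed at 0.9 purely for illustration.
    x, v = pso_step(x, v, own_best=0.9, neighborhood_best=0.9, rng=rng)
```

In a full simulation the two "best" values would be refreshed each round from the payoffs that players collect against their lattice neighbors; here they are held constant only to show the direction of the update.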