Towards time-varying proximal dynamics in Multi-Agent Network Games
Distributed decision making in multi-agent networks has recently attracted
significant research attention thanks to its wide applicability, e.g. in the
management and optimization of computer networks, power systems, robotic teams,
sensor networks and consumer markets. Distributed decision-making problems can
be modeled as inter-dependent optimization problems, i.e., multi-agent
game-equilibrium seeking problems, where noncooperative agents seek an
equilibrium by communicating over a network. To achieve a network equilibrium,
the agents may decide to update their decision variables via proximal dynamics,
driven by the decision variables of the neighboring agents. In this paper, we
provide an operator-theoretic characterization of convergence with a
time-invariant communication network. For the time-varying case, we consider
adjacency matrices that may switch subject to a dwell time. We illustrate our
investigations using a distributed robotic exploration example.
Comment: 6 pages, 3 figures
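The proximal update described in this abstract can be illustrated with a minimal sketch. It assumes scalar decisions, a fixed row-stochastic adjacency matrix, and quadratic local costs f_i(y) = (weight/2)(y - target_i)^2, none of which are specified in the abstract; all names and values below are hypothetical.

```python
def prox_quadratic(v, target, weight):
    """Proximal operator of f(y) = (weight / 2) * (y - target) ** 2 at v.

    Closed form of argmin_y f(y) + 0.5 * (y - v) ** 2.
    """
    return (v + weight * target) / (1.0 + weight)


def proximal_step(x, adjacency, targets, weight):
    """One synchronous round: each agent applies its prox to the
    adjacency-weighted average of the current decisions."""
    new_x = []
    for i, row in enumerate(adjacency):
        weighted_avg = sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        new_x.append(prox_quadratic(weighted_avg, targets[i], weight))
    return new_x


def run_proximal_dynamics(x0, adjacency, targets, weight=1.0, steps=200):
    """Iterate the proximal dynamics for a fixed number of rounds."""
    x = list(x0)
    for _ in range(steps):
        x = proximal_step(x, adjacency, targets, weight)
    return x


# Hypothetical 3-agent line network with self-loops (rows sum to 1).
ADJACENCY = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
TARGETS = [0.0, 1.0, 2.0]  # hypothetical local targets

equilibrium = run_proximal_dynamics([5.0, -3.0, 0.0], ADJACENCY, TARGETS)
```

With these quadratic costs each round is a contraction, so the iterates settle at a fixed point of `proximal_step`. The time-varying case discussed in the abstract would swap `ADJACENCY` between rounds and is not covered by this sketch.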
Aspiration Dynamics of Multi-player Games in Finite Populations
Studying strategy update rules in the framework of evolutionary game theory,
one can differentiate between imitation processes and aspiration-driven
dynamics. In the former case, individuals imitate the strategy of a more
successful peer. In the latter case, individuals adjust their strategies based
on a comparison of their payoffs from the evolutionary game to a value they
aspire to, called the level of aspiration. Unlike imitation processes of pairwise
comparison, aspiration-driven updates do not require additional information
about the strategic environment and can thus be interpreted as being more
spontaneous. Recent work has mainly focused on understanding how aspiration
dynamics alter the evolutionary outcome in structured populations. However, the
baseline case for understanding strategy selection is the well-mixed population
case, which is still lacking sufficient understanding. We explore how
aspiration-driven strategy-update dynamics under imperfect rationality
influence the average abundance of a strategy in multi-player evolutionary
games with two strategies. We analytically derive a condition under which a
strategy is more abundant than the other in the weak selection limiting case.
This approach has a long-standing history in evolutionary game theory and is mostly
applied for its mathematical approachability. Hence, we also explore strong
selection numerically, which shows that our weak selection condition is a
robust predictor of the average abundance of a strategy. The condition turns
out to differ from that of a wide class of imitation dynamics, as long as the
game is not dyadic. Therefore a strategy favored under imitation dynamics can
be disfavored under aspiration dynamics. This does not require any population
structure and thus highlights the intrinsic difference between imitation and
aspiration dynamics.
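An aspiration-driven update of the kind described in this abstract can be sketched as follows. The Fermi-type switching function is an assumption, since the abstract does not state the functional form, and the function and parameter names are hypothetical.

```python
import math
import random


def switch_probability(payoff, aspiration, selection_intensity):
    """Probability of switching strategy (assumed Fermi form).

    Payoffs below the aspiration level make switching more likely.
    A selection intensity of 0 gives random switching (probability 0.5).
    """
    exponent = -selection_intensity * (aspiration - payoff)
    return 1.0 / (1.0 + math.exp(exponent))


def aspiration_update(strategy, payoff, aspiration, selection_intensity, rng):
    """Return the (possibly switched) strategy in a two-strategy game.

    Strategies are encoded as 0 and 1. Only the individual's own payoff
    and aspiration are used, with no information about peers.
    """
    if rng.random() < switch_probability(payoff, aspiration, selection_intensity):
        return 1 - strategy
    return strategy


rng = random.Random(0)
satisfied = aspiration_update(1, payoff=10.0, aspiration=0.0,
                              selection_intensity=10.0, rng=rng)
```

Weak selection corresponds to a small `selection_intensity`, where the switching probability stays close to 0.5 regardless of payoff.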
Move ordering and communities in complex networks describing the game of go
We analyze the game of go from the point of view of complex networks. We
construct three different directed networks of increasing complexity, defining
nodes as local patterns on plaquettes of increasing sizes, and links as actual
successions of these patterns in databases of real games. We discuss the
peculiarities of these networks compared to other types of networks. We explore
the ranking vectors and community structure of the networks and show that this
approach makes it possible to extract groups of moves with common strategic properties.
We also investigate different networks built from games with players of
different levels or from different phases of the game. We discuss how the study
of the community structure of these networks may help to improve the computer
simulations of the game. More generally, we believe such studies may help to
improve the understanding of human decision processes.
Comment: 14 pages, 21 figures
Simulation of an Optional Strategy in the Prisoner's Dilemma in Spatial and Non-spatial Environments
This paper presents research comparing the effects of different environments
on the outcome of an extended Prisoner's Dilemma, in which agents have the
option to abstain from playing the game. We consider three different pure
strategies: cooperation, defection and abstinence. We adopt an evolutionary
game theoretic approach and consider two different environments: the first
imposes no spatial constraints, and in the second agents are placed
on a lattice grid. We analyse the performance of the three strategies as we
vary the loner's payoff in both structured and unstructured environments.
Furthermore we also present the results of simulations which identify scenarios
in which cooperative clusters of agents emerge and persist in both
environments.
Comment: 12 pages, 8 figures. International Conference on the Simulation of
Adaptive Behavior
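The three-strategy game in this abstract can be made concrete with a small payoff sketch. It assumes the common convention that both players receive the loner's payoff whenever either of them abstains, and it uses hypothetical payoff values ordered T > R > L > P > S; neither is given in the abstract.

```python
# Hypothetical payoff values, chosen only for illustration (T > R > L > P > S).
TEMPTATION, REWARD, LONER, PUNISHMENT, SUCKER = 5.0, 3.0, 2.0, 1.0, 0.0

MOVES = ("C", "D", "A")  # cooperate, defect, abstain

PD_TABLE = {
    ("C", "C"): (REWARD, REWARD),
    ("C", "D"): (SUCKER, TEMPTATION),
    ("D", "C"): (TEMPTATION, SUCKER),
    ("D", "D"): (PUNISHMENT, PUNISHMENT),
}


def payoffs(move_a, move_b, loner=LONER):
    """Return (payoff_a, payoff_b) for one round of the optional game.

    Assumption: if either player abstains, no game is played and both
    players receive the loner's payoff.
    """
    if move_a not in MOVES or move_b not in MOVES:
        raise ValueError("moves must be one of 'C', 'D', 'A'")
    if "A" in (move_a, move_b):
        return (loner, loner)
    return PD_TABLE[(move_a, move_b)]
```

Varying the `loner` argument mirrors the experiment the abstract describes, in which the loner's payoff is varied in both environments.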
Spillover modes in multiplex games: double-edged effects on cooperation, and their coevolution
In recent years, there has been growing interest in studying games on
multiplex networks that account for interactions across linked social contexts.
However, little is known about how potential cross-context interference, or
spillover, of individual behavioural strategy impacts overall cooperation. We
consider three plausible spillover modes, quantifying and comparing their
effects on the evolution of cooperation. In our model, social interactions take
place on two network layers: one represents repeated interactions with close
neighbours in a lattice, the other represents one-shot interactions with random
individuals across the same population. Spillover can occur during the social
learning process with accidental cross-layer strategy transfer, or during
social interactions with errors in implementation due to contextual
interference. Our analytical results, using extended pair approximation, are in
good agreement with extensive simulations. We find double-edged effects of
spillover on cooperation: increasing the intensity of spillover can promote
cooperation provided cooperation is favoured in one layer, but too much
spillover is detrimental. We also discover a bistability phenomenon of
cooperation: spillover hinders or promotes cooperation depending on initial
frequencies of cooperation in each layer. Furthermore, comparing strategy
combinations that emerge in each spillover mode provides a good indication of
their co-evolutionary dynamics with cooperation. Our results make testable
predictions that inspire future research, and shed light on human cooperation
across social domains and their interference with one another.
Evolutionary establishment of moral and double moral standards through spatial interactions
Situations where individuals have to contribute to joint efforts or share
scarce resources are ubiquitous. Yet, without proper mechanisms to ensure
cooperation, the evolutionary pressure to maximize individual success tends to
create a tragedy of the commons (such as over-fishing or the destruction of our
environment). This contribution addresses a number of related puzzles of human
behavior with an evolutionary game theoretical approach as it has been
successfully used to explain the behavior of other biological species many
times, from bacteria to vertebrates. Our agent-based model distinguishes
individuals applying four different behavioral strategies: non-cooperative
individuals ("defectors"), cooperative individuals abstaining from punishment
efforts (called "cooperators" or "second-order free-riders"), cooperators who
punish non-cooperative behavior ("moralists"), and defectors, who punish other
defectors despite being non-cooperative themselves ("immoralists"). By
considering spatial interactions with neighboring individuals, our model
reveals several interesting effects: First, moralists can fully eliminate
cooperators. This spreading of punishing behavior requires a segregation of
behavioral strategies and solves the "second-order free-rider problem". Second,
the system behavior changes its character significantly even after very long
times ("who laughs last laughs best effect"). Third, the presence of a number
of defectors can largely accelerate the victory of moralists over non-punishing
cooperators. Fourth, in order to succeed, moralists may profit from immoralists
in a way that appears like an "unholy collaboration". Our findings suggest that
the consideration of punishment strategies allows one to understand the
establishment and spreading of "moral behavior" by means of game-theoretical
concepts. This demonstrates that quantitative biological modeling approaches
are powerful even in domains that have been addressed with non-mathematical
concepts so far. The complex dynamics of certain social behaviors becomes
understandable as the result of an evolutionary competition between different
behavioral strategies.
Comment: 15 pages, 5 figures; accepted for publication in PLoS Computational
Biology [supplementary material available at
http://www.soms.ethz.ch/research/secondorder-freeriders/ and
http://www.matjazperc.com/plos/moral.html]
Interaction and Experience in Enactive Intelligence and Humanoid Robotics
We give an overview of how sensorimotor experience can be operationalized for interaction scenarios in which humanoid robots acquire skills and linguistic behaviours by enacting a “form-of-life” in interaction games (following Wittgenstein) with humans. We introduce the enactive paradigm, which provides a powerful framework for the construction of complex adaptive systems based on interaction, habit, and experience. The enactive cognitive architectures that we have developed (following insights of Varela, Thompson and Rosch) support social learning and robot ontogeny by harnessing information-theoretic methods and raw uninterpreted sensorimotor experience to scaffold the acquisition of behaviours. The success criterion here is validation by the robot engaging in ongoing human-robot interaction with naive participants who, over the course of iterated interactions, shape the robot’s behavioural and linguistic development. Engagement in such interaction, exhibiting aspects of purposeful, habitual, recurring structure, evidences the developed capability of the humanoid to enact language and interaction games as a successful participant.
Learning with Opponent-Learning Awareness
Multi-agent settings are quickly gathering importance in machine learning.
This includes a plethora of recent work on deep multi-agent reinforcement
learning, but also can be extended to hierarchical RL, generative adversarial
networks and decentralised optimisation. In all these settings the presence of
multiple learning agents renders the training problem non-stationary and often
leads to unstable training or undesired final results. We present Learning with
Opponent-Learning Awareness (LOLA), a method in which each agent shapes the
anticipated learning of the other agents in the environment. The LOLA learning
rule includes a term that accounts for the impact of one agent's policy on the
anticipated parameter update of the other agents. Results show that the
encounter of two LOLA agents leads to the emergence of tit-for-tat and
therefore cooperation in the iterated prisoners' dilemma, while independent
learning does not. In this domain, LOLA also receives higher payouts compared
to a naive learner, and is robust against exploitation by higher order
gradient-based methods. Applied to repeated matching pennies, LOLA agents
converge to the Nash equilibrium. In a round robin tournament we show that LOLA
agents successfully shape the learning of a range of multi-agent learning
algorithms from the literature, resulting in the highest average returns on the
IPD. We also show that the LOLA update rule can be efficiently calculated using
an extension of the policy gradient estimator, making the method suitable for
model-free RL. The method thus scales to large parameter and input spaces and
nonlinear function approximators. We apply LOLA to a grid world task with an
embedded social dilemma using recurrent policies and opponent modelling. By
explicitly considering the learning of the other agent, LOLA agents learn to
cooperate out of self-interest. The code is at github.com/alshedivat/lola
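The extra term in the LOLA learning rule can be illustrated on a toy problem. This is a sketch only: it assumes a hypothetical scalar zero-sum bilinear game with V1 = theta1 * theta2 and V2 = -theta1 * theta2, hand-coded derivatives, and exact gradients. It is not the authors' implementation and not one of the games from the abstract. Each agent adds to its naive gradient a term equal to the look-ahead rate times the sensitivity of its own value to the opponent's parameter times the mixed second derivative of the opponent's value.

```python
def lola_step(theta1, theta2, step_size, lookahead):
    """One simultaneous update in the toy game V1 = theta1 * theta2, V2 = -V1.

    lookahead = 0 recovers naive independent gradient ascent.
    """
    # Naive gradients of each agent's own value.
    grad1 = theta2            # dV1/dtheta1
    grad2 = -theta1           # dV2/dtheta2

    # Opponent-shaping terms (hand-derived for this bilinear game):
    # agent 1: (dV1/dtheta2) * (d2V2 / dtheta1 dtheta2) = theta1 * (-1)
    # agent 2: (dV2/dtheta1) * (d2V1 / dtheta2 dtheta1) = (-theta2) * 1
    shaping1 = -theta1
    shaping2 = -theta2

    new_theta1 = theta1 + step_size * (grad1 + lookahead * shaping1)
    new_theta2 = theta2 + step_size * (grad2 + lookahead * shaping2)
    return new_theta1, new_theta2


def run(theta1, theta2, step_size=0.1, lookahead=1.0, steps=200):
    """Iterate the update and return the final parameters."""
    for _ in range(steps):
        theta1, theta2 = lola_step(theta1, theta2, step_size, lookahead)
    return theta1, theta2


def norm(point):
    """Euclidean distance from the equilibrium at (0, 0)."""
    return (point[0] ** 2 + point[1] ** 2) ** 0.5


lola_final = run(1.0, 1.0, lookahead=1.0)   # with opponent shaping
naive_final = run(1.0, 1.0, lookahead=0.0)  # independent learners
```

In this toy game the naive learners spiral away from the equilibrium at (0, 0), while the shaped update spirals inward for the chosen `step_size` and `lookahead`. The model-free variant mentioned in the abstract would replace the hand-coded derivatives with policy-gradient estimates.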