
    Towards time-varying proximal dynamics in Multi-Agent Network Games

    Distributed decision making in multi-agent networks has recently attracted significant research attention thanks to its wide applicability, e.g. in the management and optimization of computer networks, power systems, robotic teams, sensor networks and consumer markets. Distributed decision-making problems can be modeled as interdependent optimization problems, i.e., multi-agent game-equilibrium-seeking problems, where noncooperative agents seek an equilibrium by communicating over a network. To reach a network equilibrium, the agents may update their decision variables via proximal dynamics, driven by the decision variables of the neighboring agents. In this paper, we provide an operator-theoretic characterization of convergence for a time-invariant communication network. For the time-varying case, we consider adjacency matrices that may switch subject to a dwell time. We illustrate our investigations using a distributed robotic exploration example. Comment: 6 pages, 3 figures
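    The proximal update described in the abstract can be sketched for a toy problem. A minimal sketch, assuming quadratic local costs f_i(x) = 0.5*(x - q_i)^2 on a four-agent ring with a row-stochastic adjacency matrix; the targets q_i, the network and the costs are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Hypothetical setup: 4 agents on a ring, each with quadratic local
# cost f_i(x) = 0.5 * (x - q_i)^2, whose proximal operator (step 1)
# is prox_{f_i}(v) = (v + q_i) / 2.
q = np.array([1.0, 3.0, 5.0, 7.0])           # private targets (illustrative)
A = np.array([[0.0, 0.5, 0.0, 0.5],          # row-stochastic adjacency
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])

x = np.zeros(4)                               # agents' decision variables
for _ in range(200):
    v = A @ x                                 # mix neighbours' decisions
    x = 0.5 * (v + q)                         # proximal step on local cost
```

Because the composed map x -> 0.5*A@x + 0.5*q is a contraction (the spectral radius of 0.5*A is 0.5), the iteration converges to the unique network equilibrium regardless of the starting point.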

    Aspiration Dynamics of Multi-player Games in Finite Populations

    Studying strategy-update rules in the framework of evolutionary game theory, one can distinguish between imitation processes and aspiration-driven dynamics. In the former, individuals imitate the strategy of a more successful peer. In the latter, individuals adjust their strategies based on a comparison of their payoff from the evolutionary game with a value they aspire to, called the level of aspiration. Unlike imitation processes of pairwise comparison, aspiration-driven updates do not require additional information about the strategic environment and can thus be interpreted as more spontaneous. Recent work has mainly focused on how aspiration dynamics alter the evolutionary outcome in structured populations. However, the baseline case for understanding strategy selection is the well-mixed population, which is still insufficiently understood. We explore how aspiration-driven strategy-update dynamics under imperfect rationality influence the average abundance of a strategy in multi-player evolutionary games with two strategies. We analytically derive a condition under which one strategy is more abundant than the other in the weak-selection limit. This approach has a long-standing history in evolutionary game theory and is mostly applied for its mathematical tractability. Hence, we also explore strong selection numerically, which shows that our weak-selection condition is a robust predictor of the average abundance of a strategy. The condition turns out to differ from that of a wide class of imitation dynamics as long as the game is not dyadic. Therefore, a strategy favored under imitation dynamics can be disfavored under aspiration dynamics. This requires no population structure and thus highlights the intrinsic difference between imitation and aspiration dynamics.
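    An aspiration-driven update under imperfect rationality is commonly modelled with a Fermi function, where the switching probability depends on the gap between aspiration and realized payoff. A minimal sketch under that assumption (the function name and the choice of `beta` as the rationality parameter are illustrative, not the paper's notation):

```python
import math

def switch_probability(payoff, aspiration, beta):
    """Probability that an individual abandons its current strategy
    under aspiration-driven updating with imperfect rationality.
    beta > 0 tunes rationality: beta -> 0 gives random (weak-selection)
    switching at probability 1/2, large beta gives near-deterministic
    switching whenever the payoff falls short of the aspiration."""
    return 1.0 / (1.0 + math.exp(-beta * (aspiration - payoff)))
```

Note that no information about any co-player's payoff enters the rule, which is the sense in which aspiration dynamics are "more spontaneous" than pairwise imitation.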

    Move ordering and communities in complex networks describing the game of go

    We analyze the game of go from the point of view of complex networks. We construct three directed networks of increasing complexity, defining nodes as local patterns on plaquettes of increasing sizes and links as actual successions of these patterns in databases of real games. We discuss the peculiarities of these networks compared to other types of networks. We explore the ranking vectors and community structure of the networks and show that this approach enables us to extract groups of moves with common strategic properties. We also investigate networks built from games between players of different levels or from different phases of the game. We discuss how the study of the community structure of these networks may help to improve computer simulations of the game. More generally, we believe such studies may help to improve the understanding of the human decision process. Comment: 14 pages, 21 figures
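    The link construction described above can be sketched as a weighted directed edge count over successive patterns. A minimal sketch, assuming pattern extraction from board plaquettes has already been done upstream and each game is given as a sequence of pattern identifiers (the identifiers here are toy placeholders):

```python
from collections import Counter

def build_move_network(games):
    """Build a weighted directed network: nodes are local patterns,
    and each pair of successive patterns within a game contributes
    one unit of weight to the corresponding directed edge."""
    edges = Counter()
    for game in games:
        for src, dst in zip(game, game[1:]):
            edges[(src, dst)] += 1
    return edges

games = [["p1", "p2", "p3"], ["p1", "p2", "p1"]]  # toy pattern sequences
net = build_move_network(games)
```

Ranking and community detection would then run on this weighted directed graph; restricting `games` to a single player level or game phase yields the specialized networks compared in the abstract.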

    Simulation of an Optional Strategy in the Prisoner's Dilemma in Spatial and Non-spatial Environments

    This paper presents research comparing the effects of different environments on the outcome of an extended Prisoner's Dilemma in which agents have the option to abstain from playing the game. We consider three pure strategies: cooperation, defection and abstinence. We adopt an evolutionary game-theoretic approach and consider two environments: the first imposes no spatial constraints, and in the second, agents are placed on a lattice grid. We analyse the performance of the three strategies as we vary the loner's payoff in both structured and unstructured environments. Furthermore, we present the results of simulations which identify scenarios in which cooperative clusters of agents emerge and persist in both environments. Comment: 12 pages, 8 figures. International Conference on the Simulation of Adaptive Behavior
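    The three-strategy interaction can be sketched as a pairwise payoff function. A minimal sketch, assuming a standard PD payoff table (R, S, T, P) and a fixed loner's payoff `sigma` whenever either party abstains; the numerical values are illustrative placeholders, and `sigma` is precisely the quantity the paper varies:

```python
# Illustrative payoffs for the optional Prisoner's Dilemma.
R, S, T, P = 3.0, 0.0, 5.0, 1.0   # reward, sucker, temptation, punishment
sigma = 2.0                        # loner's payoff (varied in the paper)

def payoff(me, other):
    """Payoff to `me` when strategies are 'C' (cooperate),
    'D' (defect) or 'L' (abstain / loner)."""
    if me == "L" or other == "L":
        return sigma               # abstention pays sigma to both parties
    table = {("C", "C"): R, ("C", "D"): S,
             ("D", "C"): T, ("D", "D"): P}
    return table[(me, other)]
```

A common convention is P < sigma < R, so that abstaining beats mutual defection but not mutual cooperation, which is what makes the loner option interesting for sustaining cooperation.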

    Spillover modes in multiplex games: double-edged effects on cooperation, and their coevolution

    In recent years, there has been growing interest in studying games on multiplex networks that account for interactions across linked social contexts. However, little is known about how potential cross-context interference, or spillover, of individual behavioural strategies impacts overall cooperation. We consider three plausible spillover modes, quantifying and comparing their effects on the evolution of cooperation. In our model, social interactions take place on two network layers: one represents repeated interactions with close neighbours on a lattice; the other represents one-shot interactions with random individuals from the same population. Spillover can occur during the social-learning process, through accidental cross-layer strategy transfer, or during social interactions, through errors in implementation due to contextual interference. Our analytical results, using extended pair approximation, are in good agreement with extensive simulations. We find double-edged effects of spillover on cooperation: increasing the intensity of spillover can promote cooperation provided cooperation is favoured in one layer, but too much spillover is detrimental. We also discover a bistability phenomenon: spillover hinders or promotes cooperation depending on the initial frequency of cooperation in each layer. Furthermore, comparing the strategy combinations that emerge in each spillover mode provides a good indication of their coevolutionary dynamics with cooperation. Our results make testable predictions that inspire future research and shed light on human cooperation across social domains and the interference of these domains with one another.
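    The first spillover mode, accidental cross-layer strategy transfer during social learning, can be sketched as follows. This is a minimal sketch under stated assumptions: the layer names, the per-agent strategy dictionary and the parameter `p_spill` are illustrative, not the paper's notation:

```python
import random

def social_learning_step(strategies, agent, model, layer, p_spill, rng):
    """One imitation event with accidental cross-layer transfer:
    the focal agent copies the model's strategy in `layer`, and with
    probability p_spill the copied strategy spills over into the
    agent's other layer as well."""
    other = "random" if layer == "lattice" else "lattice"
    strategies[agent][layer] = strategies[model][layer]
    if rng.random() < p_spill:                 # accidental transfer
        strategies[agent][other] = strategies[model][layer]

rng = random.Random(0)
strats = {0: {"lattice": "D", "random": "D"},
          1: {"lattice": "C", "random": "C"}}
social_learning_step(strats, 0, 1, "lattice", 1.0, rng)
```

With `p_spill = 0` the two layers evolve independently; raising it couples them, which is the knob behind the double-edged effect reported in the abstract.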

    Evolutionary establishment of moral and double moral standards through spatial interactions

    Situations where individuals have to contribute to joint efforts or share scarce resources are ubiquitous. Yet, without proper mechanisms to ensure cooperation, the evolutionary pressure to maximize individual success tends to create a tragedy of the commons (such as over-fishing or the destruction of our environment). This contribution addresses a number of related puzzles of human behavior with an evolutionary game-theoretic approach, as has been used many times to successfully explain the behavior of other biological species, from bacteria to vertebrates. Our agent-based model distinguishes individuals applying four behavioral strategies: non-cooperative individuals ("defectors"), cooperative individuals abstaining from punishment efforts (called "cooperators" or "second-order free-riders"), cooperators who punish non-cooperative behavior ("moralists"), and defectors who punish other defectors despite being non-cooperative themselves ("immoralists"). By considering spatial interactions with neighboring individuals, our model reveals several interesting effects. First, moralists can fully eliminate cooperators. This spreading of punishing behavior requires a segregation of behavioral strategies and solves the "second-order free-rider problem". Second, the system behavior changes its character significantly even after very long times (a "who laughs last laughs best" effect). Third, the presence of a number of defectors can largely accelerate the victory of moralists over non-punishing cooperators. Fourth, in order to succeed, moralists may profit from immoralists in what appears to be an "unholy collaboration". Our findings suggest that the consideration of punishment strategies allows us to understand the establishment and spreading of "moral behavior" by means of game-theoretical concepts. This demonstrates that quantitative biological modeling approaches are powerful even in domains that have so far been addressed with non-mathematical concepts. The complex dynamics of certain social behaviors becomes understandable as the result of an evolutionary competition between different behavioral strategies. Comment: 15 pages, 5 figures; accepted for publication in PLoS Computational Biology [supplementary material available at http://www.soms.ethz.ch/research/secondorder-freeriders/ and http://www.matjazperc.com/plos/moral.html]
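    The four-strategy setup can be sketched as a public-goods payoff with punishment. A minimal sketch, assuming a multiplication factor `r`, contribution `c`, fine `beta` per punisher and punishment cost `gamma` per punished defector; all parameter values and the exclusion of self-punishment for immoralists are illustrative assumptions, not the paper's calibration:

```python
# Strategies: 'D' defector, 'C' cooperator (second-order free-rider),
# 'M' moralist (punishing cooperator), 'I' immoralist (punishing defector).
def group_payoffs(group, r=3.0, c=1.0, beta=1.0, gamma=0.3):
    """Payoffs for one public-goods group: contributions by C and M are
    multiplied by r and shared equally; M and I fine every defector
    (beta per punisher) and pay gamma per defector they punish."""
    n = len(group)
    contributors = sum(s in ("C", "M") for s in group)
    punishers = sum(s in ("M", "I") for s in group)
    defectors = sum(s in ("D", "I") for s in group)
    share = r * c * contributors / n
    payoffs = []
    for s in group:
        p = share - (c if s in ("C", "M") else 0.0)
        if s == "D":
            p -= beta * punishers              # fined by all punishers
        if s == "I":
            p -= beta * (punishers - 1)        # not fined by itself
            p -= gamma * (defectors - 1)       # punishes other defectors
        if s == "M":
            p -= gamma * defectors             # pays to punish defectors
        payoffs.append(p)
    return payoffs
```

In a mixed group a moralist earns less than a plain cooperator (it pays the punishment cost), which is exactly the second-order free-rider problem; the abstract's point is that spatial segregation nevertheless lets moralists win.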

    Interaction and Experience in Enactive Intelligence and Humanoid Robotics

    We give an overview of how sensorimotor experience can be operationalized for interaction scenarios in which humanoid robots acquire skills and linguistic behaviours by enacting a “form-of-life” in interaction games (following Wittgenstein) with humans. The enactive paradigm is introduced, which provides a powerful framework for the construction of complex adaptive systems based on interaction, habit and experience. The enactive cognitive architectures we have developed (following insights of Varela, Thompson and Rosch) support social learning and robot ontogeny by harnessing information-theoretic methods and raw, uninterpreted sensorimotor experience to scaffold the acquisition of behaviours. The success criterion here is validation by the robot engaging in ongoing human-robot interaction with naive participants who, over the course of iterated interactions, shape the robot's behavioural and linguistic development. Engagement in such interaction, exhibiting aspects of purposeful, habitual recurring structure, evidences the humanoid's developed capability to enact language and interaction games as a successful participant.

    Learning with Opponent-Learning Awareness

    Multi-agent settings are quickly gathering importance in machine learning. This includes a plethora of recent work on deep multi-agent reinforcement learning, but also extends to hierarchical RL, generative adversarial networks and decentralised optimisation. In all these settings, the presence of multiple learning agents renders the training problem non-stationary and often leads to unstable training or undesired final results. We present Learning with Opponent-Learning Awareness (LOLA), a method in which each agent shapes the anticipated learning of the other agents in the environment. The LOLA learning rule includes a term that accounts for the impact of one agent's policy on the anticipated parameter update of the other agents. Results show that the encounter of two LOLA agents leads to the emergence of tit-for-tat, and therefore cooperation, in the iterated prisoner's dilemma (IPD), while independent learning does not. In this domain, LOLA also receives higher payouts than a naive learner and is robust against exploitation by higher-order gradient-based methods. Applied to repeated matching pennies, LOLA agents converge to the Nash equilibrium. In a round-robin tournament, we show that LOLA agents successfully shape the learning of a range of multi-agent learning algorithms from the literature, resulting in the highest average returns on the IPD. We also show that the LOLA update rule can be efficiently calculated using an extension of the policy-gradient estimator, making the method suitable for model-free RL. The method thus scales to large parameter and input spaces and nonlinear function approximators. We apply LOLA to a grid-world task with an embedded social dilemma, using recurrent policies and opponent modelling. By explicitly considering the learning of the other agent, LOLA agents learn to cooperate out of self-interest. The code is at github.com/alshedivat/lola.
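    The extra term in the LOLA rule can be illustrated on a toy scalar zero-sum game V1 = t1*t2, V2 = -t1*t2, a continuous stand-in for matching pennies whose Nash equilibrium is t1 = t2 = 0. This is a hand-derived first-order sketch, not the paper's policy-gradient formulation; the step sizes alpha and eta are illustrative:

```python
def lola_step(t1, t2, alpha=0.1, eta=1.0):
    """One first-order LOLA update for both agents on V1 = t1*t2,
    V2 = -t1*t2. Each agent ascends its own value plus a correction
    term that accounts for the opponent's anticipated update."""
    d1_V1, d2_V1 = t2, t1          # gradients of V1 w.r.t. t1, t2
    d1_V2, d2_V2 = -t2, -t1        # gradients of V2 w.r.t. t1, t2
    d12_V2 = -1.0                  # d^2 V2 / (dt1 dt2)
    d21_V1 = 1.0                   # d^2 V1 / (dt2 dt1)
    new_t1 = t1 + alpha * (d1_V1 + eta * d2_V1 * d12_V2)
    new_t2 = t2 + alpha * (d2_V2 + eta * d1_V2 * d21_V1)
    return new_t1, new_t2

t1, t2 = 1.0, 1.0
for _ in range(500):
    t1, t2 = lola_step(t1, t2)     # converges toward the Nash point (0, 0)
```

Setting `eta = 0` recovers naive simultaneous gradient ascent, which spirals away from the equilibrium in this game; the opponent-shaping term is what damps the rotation and yields convergence, mirroring the matching-pennies result in the abstract.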