
    Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi

    Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect, but playing in an ad-hoc setting requires agents to adapt to their partners' strategies with no previous coordination. Evaluating an agent in this setting requires a diverse population of potential partners, but so far, the behavioral diversity of agents has not been considered in a systematic way. This paper proposes Quality Diversity algorithms as a promising class of algorithms to generate diverse populations for this purpose, and generates a population of diverse Hanabi agents using MAP-Elites. We also postulate that agents can benefit from a diverse population during training and implement a simple "meta-strategy" for adapting to an agent's perceived behavioral niche. We show this meta-strategy can work better than generalist strategies even outside the population it was trained with if its partner's behavioral niche can be correctly inferred, but in practice a partner's behavior depends on and interferes with the meta-agent's own behavior, suggesting an avenue for future research in characterizing another agent's behavior during gameplay. Comment: arXiv admin note: text overlap with arXiv:1907.0384
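As a rough illustration of the MAP-Elites idea named in the abstract above, the sketch below keeps an archive with one "elite" per behavioral niche and fills it by mutating randomly chosen elites. It is not the paper's Hanabi setup: the toy problem (`toy_evaluate`, `toy_random`, `toy_mutate`), the grid resolution `n_bins`, and all other parameter names are hypothetical choices made only for this example.

```python
import random

def map_elites(evaluate, random_solution, mutate, n_bins,
               iterations=2000, n_init=50, seed=0):
    """Generic MAP-Elites loop (illustrative sketch, not the paper's code).

    evaluate(sol) must return (fitness, descriptor), where descriptor is a
    tuple of floats in [0, 1] describing the solution's behavior.
    """
    rng = random.Random(seed)
    archive = {}  # niche index (tuple of ints) -> (fitness, solution)

    def try_insert(sol):
        fitness, descriptor = evaluate(sol)
        # Discretize each behavior dimension into n_bins cells.
        niche = tuple(min(int(d * n_bins), n_bins - 1) for d in descriptor)
        # Keep the solution only if its niche is empty or it beats the elite.
        if niche not in archive or fitness > archive[niche][0]:
            archive[niche] = (fitness, sol)

    # Seed the archive with random solutions.
    for _ in range(n_init):
        try_insert(random_solution(rng))

    # Repeatedly pick an existing elite, mutate it, and try to place the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))
    return archive

# Hypothetical toy problem: a solution is two numbers in [0, 1]; its behavior
# descriptor is the solution itself, and fitness rewards closeness to 0.5.
def toy_evaluate(sol):
    fitness = -sum((x - 0.5) ** 2 for x in sol)
    return fitness, tuple(sol)

def toy_random(rng):
    return [rng.random(), rng.random()]

def toy_mutate(sol, rng):
    # Gaussian perturbation, clipped back into [0, 1].
    return [min(1.0, max(0.0, x + rng.gauss(0, 0.1))) for x in sol]

if __name__ == "__main__":
    archive = map_elites(toy_evaluate, toy_random, toy_mutate, n_bins=5)
    print(f"{len(archive)} niches filled out of {5 * 5}")
```

In a Hanabi setting the descriptor would instead be some measured property of an agent's play, and the archive would then serve as the diverse partner population; which properties to use is a design choice this sketch does not address.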

    Reward and Punishment in Minigames

    Minigames capturing the essence of Public Goods experiments show that even in the absence of rationality assumptions, both punishment and reward will fail to bring about prosocial behavior. This holds in particular for the well-known Ultimatum Game, which emerges as a special case. But reputation can induce fairness and cooperation in populations adapting through learning or imitation. Indeed, the inclusion of reputation effects in the corresponding dynamical models leads to the evolution of economically productive behavior, with agents contributing to the public good and either punishing those who don't, or rewarding those who do. Reward and punishment correspond to two types of bifurcation with intriguing complementarity. The analysis suggests that reputation is essential for fostering social behavior among selfish agents, and that it is considerably more effective with punishment than with rewards.

    Bio-Inspired Virtual Populations: Adaptive Behavior with Affective Feedback

    In this paper, we describe an agency model for generative populations of humanoid characters, based upon temporal variation of affective states. We have built on an existing agent framework from Sequeira et al. [17], and adapted it to be susceptible to temperamental and emotive states in the context of cooperative and non-cooperative interactions based on trading activity. More specifically, this model operates within two existing frameworks: a) intrinsically motivated reinforcement learning, structured upon affective appraisals in the relationship of the agents with their environment [19,17]; b) a multi-temporal representation of individual psychology, common in the field of affective computing, structuring individual psychology as a tripartite relationship: emotions-moods-personality [7,15]. Results show a population of agents that express their individuality and autonomy with a high level of heterogeneous and spontaneous behaviors, while simultaneously adapting and overcoming their perceptual limitations.

    Evolutionary macroeconomic assessment of employment and innovation impacts of climate policy packages

    Climate policy has been mainly studied with economic models that assume representative, rational agents. Such policy aims, though, at changing carbon-intensive consumption and production patterns driven by bounded rationality and other-regarding preferences, such as status and imitation. To examine climate policy under such alternative behavioral assumptions, we develop a model tool by adapting an existing general-purpose macroeconomic multi-agent model. The resulting tool allows testing various climate policies in terms of combined climate and economic performance. The model is particularly suitable to address the distributional impacts of climate policies, not only because populations of many agents are included, but also as these are composed of different classes of households. The approach accounts for two types of innovations, which improve either the carbon or labor intensity of production. We simulate policy scenarios with distinct combinations of carbon taxation, a reduction of labor taxes, subsidies for green innovation, a price subsidy to consumers for less carbon-intensive products, and green government procurement. The results show pronounced differences from those obtained by rational-agent model studies. It turns out that a supply-oriented subsidy for green innovation, funded by the revenues of a carbon tax, results in a significant reduction of carbon emissions without causing negative effects on employment. On the contrary, demand-oriented subsidies for adopting greener technologies, funded in the same manner, result in either no or considerably less reduction of carbon emissions and may even lead to higher unemployment. Our study also contributes insight into a potential double dividend of shifting taxes from labor to carbon.

    Autonomous virulence adaptation improves coevolutionary optimization


    Comparison of Selection Methods in On-line Distributed Evolutionary Robotics

    In this paper, we study the impact of selection methods in the context of on-line on-board distributed evolutionary algorithms. We propose a variant of the mEDEA algorithm in which we add a selection operator, and we apply it in a task-driven scenario. We evaluate four selection methods that induce different intensities of selection pressure in a multi-robot navigation with obstacle avoidance task and a collective foraging task. Experiments show that a small intensity of selection pressure is sufficient to rapidly obtain good performance on the tasks at hand. We introduce different measures to compare the selection methods, and show that the higher the selection pressure, the better the performance obtained, especially for the more challenging food foraging task.

    About the Power to Enforce and Prevent Consensus by Manipulating Communication Rules

    We explore the possibilities of enforcing and preventing consensus in continuous opinion dynamics that result from modifications in the communication rules. We refer to the model of Weisbuch and Deffuant, where n agents adjust their continuous opinions as a result of random pairwise encounters whenever their opinions differ by no more than a given bound of confidence ε. A high ε leads to consensus, while a lower ε leads to a fragmentation into several opinion clusters. We drop the random encounter assumption and ask: How small may ε be such that consensus is still possible with a certain communication plan for the entire group? Mathematical analysis shows that ε may be significantly smaller than in the random pairwise case. On the other hand we ask: How large may ε be such that preventing consensus is still possible? In answering this question we prove Fortunato's simulation result that consensus cannot be prevented for ε > 0.5 for large groups. Next we consider opinion dynamics under different individual strategies and examine their power to increase the chances of consensus. One result is that balancing agents increase chances of consensus, especially if the agents are cautious in adapting their opinions. However, curious agents increase chances of consensus only if those agents are not cautious in adapting their opinions. Comment: 21 pages, 6 figures
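The random pairwise baseline described in the abstract above can be sketched as follows. This is a minimal illustration of the standard bounded-confidence update, not the paper's communication plans: two randomly chosen agents move toward each other only if their opinions differ by no more than the bound of confidence (called `eps` in the code). The convergence rate `mu`, the uniform initial opinions on [0, 1], and the helper `count_clusters` are assumptions introduced for this example.

```python
import random

def deffuant_step(opinions, eps, mu=0.5, rng=random):
    """One random pairwise encounter under bounded confidence.

    mu is an assumed convergence rate; mu=0.5 moves both agents to their mean.
    """
    i, j = rng.sample(range(len(opinions)), 2)
    diff = opinions[j] - opinions[i]
    # Agents only interact when their opinions are within eps of each other.
    if abs(diff) <= eps:
        opinions[i] += mu * diff
        opinions[j] -= mu * diff

def simulate(n=100, eps=0.3, steps=50_000, mu=0.5, seed=0):
    """Run `steps` random encounters from uniform initial opinions in [0, 1]."""
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n)]
    for _ in range(steps):
        deffuant_step(opinions, eps, mu, rng)
    return opinions

def count_clusters(opinions, tol=1e-2):
    """Count opinion groups separated by gaps larger than tol (a crude heuristic)."""
    xs = sorted(opinions)
    return 1 + sum(1 for a, b in zip(xs, xs[1:]) if b - a > tol)

if __name__ == "__main__":
    for eps in (0.1, 0.3, 0.6):
        final = simulate(eps=eps)
        print(f"eps={eps}: {count_clusters(final)} cluster(s)")
```

Running the loop at several values of `eps` gives a feel for the consensus versus fragmentation behavior the abstract refers to; the paper's own question is what happens when the random choice of pairs is replaced by a designed communication plan.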